The content on this page was provided by an independent third party and syndicated by XPR Media. Members of the editorial and news staff of the USA TODAY Network were not involved in the creation of this content.

AI Built for Law Outperforms ChatGPT, Claude, and Gemini on Legal Reasoning Benchmark

DescrybeLM answered all 200 bar exam questions correctly. ChatGPT, Claude, and Gemini each missed between 13 and 23—and scored lower on legal reasoning quality.

BOSTON, MA, UNITED STATES, March 5, 2026 /EINPresswire.com/ — When AI gets a legal question wrong, the most dangerous failure isn’t an obvious error. It’s an answer that sounds authoritative: fluent, confident, well-structured, and yet applying the wrong legal standard. The error reads like competent lawyering.

Today, Descrybe launched DescrybeLM — an AI system built specifically for legal reasoning — and published a white paper with benchmark data to show what that difference looks like in practice.

Descrybe ran a controlled benchmark against ChatGPT 5.2, Claude Opus 4.5, and Gemini 3 Pro on 200 multistate bar exam questions. The study measured not just whether each system chose the correct answer, but whether the legal reasoning behind it was sound: Did it identify the right rule? Apply it correctly to the facts? Avoid the traps that produce persuasive but wrong analysis?

“We had a thesis that purpose-built legal AI produces meaningfully different results for legal reasoning tasks. Legal professionals deserve to make tool decisions based on real evidence. So we tested ourselves, published our methodology, and invite anyone to replicate it,” said Kara Peterson, Co-Founder and CEO of Descrybe.

What the benchmark showed

All four systems were tested under standardized, no-external-web conditions using the NCBE MBE Complete Practice Exam (Questions 1–200, no exclusions), producing 800 separate evaluation runs with blinded scoring.

When general-purpose models were wrong, they were confidently wrong. Among 52 incorrect outputs, 49 delivered assertive, well-structured reasoning that did not signal uncertainty — the failure mode that imposes the highest verification burden on practitioners. The dominant patterns were applying the wrong legal standard or misapplying the correct one, while the prose read like competent analysis.

Two models, Claude Opus 4.5 and Gemini 3 Pro, exhibited overconfident tone on correct outputs as well as incorrect ones. DescrybeLM and ChatGPT 5.2 each received zero overconfidence flags across all 200 of their outputs. A system that sounds equally confident whether it is right or wrong gives practitioners no reliable signal from tone alone.

The study also found that cross-checking between general-purpose models is not a reliable substitute for getting the answer right. Across 200 questions, 40 were missed by at least one model, 11 by two or more, and only 1 by all three — meaning errors were largely unpredictable and non-overlapping.
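
As a quick consistency check, the figures reported above can be reconciled with simple arithmetic. The sketch below is illustrative only: the variable names are not from the white paper, and it assumes that all 52 incorrect outputs came from the three general-purpose models, which follows from DescrybeLM's reported 200 of 200.

```python
# Figures as stated in the release.
systems = 4
questions = 200
incorrect_outputs_reported = 52
confidently_wrong = 49
missed_by_at_least_one = 40
missed_by_two_or_more = 11
missed_by_all_three = 1

# Derived breakdown of questions by how many general-purpose models missed them.
missed_by_exactly_one = missed_by_at_least_one - missed_by_two_or_more
missed_by_exactly_two = missed_by_two_or_more - missed_by_all_three

# Each question contributes one incorrect output per model that missed it.
incorrect_outputs_derived = (
    1 * missed_by_exactly_one
    + 2 * missed_by_exactly_two
    + 3 * missed_by_all_three
)

total_runs = systems * questions
confident_share = round(100 * confidently_wrong / incorrect_outputs_reported, 1)

print(total_runs)                 # 800 evaluation runs
print(missed_by_exactly_one)      # 29 questions missed by exactly one model
print(missed_by_exactly_two)      # 10 questions missed by exactly two models
print(incorrect_outputs_derived)  # 52, matching the reported total
print(confident_share)            # 94.2 percent of incorrect outputs
```

Under these assumptions the overlap counts and the 52 incorrect outputs agree, and 29 of the 40 missed questions were missed by only a single model.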

What’s behind the results

DescrybeLM is built on a curated primary-law corpus of more than 100 million structured records, requiring more than 100 billion tokens of preparation.

“Most AI tools are built for general use and adapted for law. DescrybeLM was built differently: from the foundation up, specifically for legal reasoning, on more than 100 million structured records individually cleaned and organized for that purpose. That kind of data work is painstaking and takes years — but it’s the difference between a system that sounds right and one that is right,” said Richard DiBona, Co-Founder and CTO of Descrybe.

Why this matters

The headline problem in legal AI isn’t systems that obviously fail. It’s systems that fail invisibly, confidently, and in a way that reads like competent analysis. In a crowded market, sounding right is easy to mistake for being right. Legal professionals need real evidence to decide which tools to use for which purposes — which is why Descrybe published its methodology and invites independent replication.

“It’s rare to see something that genuinely stops you in your tracks. When I saw DescrybeLM answer all 200 multistate bar exam questions correctly while ChatGPT, Claude, and Gemini each missed double digits — that’s not a marginal difference. That’s a different category of tool,” said Ken Friedman, legal technology pioneer and advisor to Descrybe.

The full white paper, Beyond Confidently Wrong: How Purpose-Built AI Mitigates Legal Reasoning’s Hidden Risk, is available now.

Kara Peterson
Descrybe
+1 617-752-2020