How I Discovered CVE-2025-21311: The Breakdown and Why It Matters

Late last year (2024), while testing the then-new Windows Server 2025, I set up Active Directory (AD) with Network Policy Server (NPS) and configured RADIUS authentication using MSCHAPv2. Everything seemed normal until I noticed something that didn’t make sense.


The Discovery

Shortly after setting up the new test environment and configuring network devices to use NPS for authentication, I discovered that some devices (MikroTik RouterOS) would allow SSH sessions with only a username (the password prompt would not appear). After trying different RouterOS firmware versions, I realized that wasn’t the full story.

My next step was to capture the traffic between RouterOS and NPS with Wireshark. I found that the NPS server would always respond with an “Access-Accept” packet, even when the password was wrong. The only requirement? The username had to be valid and authorized to access the resource.
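For anyone repeating this kind of capture, the verdict is visible in the first byte of the RADIUS packet: per RFC 2865, code 2 is Access-Accept and code 3 is Access-Reject. Below is a minimal sketch of reading that byte and the attribute list from a raw UDP payload. The helper names `parse_radius` and `microsoft_vsas` are hypothetical (not from any library), and the sketch assumes one sub-attribute per Vendor-Specific attribute and does not validate the packet authenticator.

```python
import struct

# RADIUS packet codes defined in RFC 2865.
RADIUS_CODES = {
    1: "Access-Request",
    2: "Access-Accept",
    3: "Access-Reject",
    11: "Access-Challenge",
}

VENDOR_SPECIFIC = 26       # RFC 2865 attribute type for vendor attributes
MICROSOFT_VENDOR_ID = 311  # Microsoft's vendor ID, used by RFC 2548


def parse_radius(packet: bytes) -> dict:
    """Split a raw RADIUS packet (UDP payload) into header fields and attributes."""
    if len(packet) < 20:
        raise ValueError("RADIUS packet shorter than the 20-byte header")
    code, identifier, length = struct.unpack("!BBH", packet[:4])
    if length < 20 or length > len(packet):
        raise ValueError("invalid RADIUS length field")

    attributes = []
    offset = 20  # attributes start after code, id, length, 16-byte authenticator
    while offset < length:
        if offset + 2 > length:
            raise ValueError("truncated attribute header")
        attr_type, attr_len = packet[offset], packet[offset + 1]
        if attr_len < 2 or offset + attr_len > length:
            raise ValueError("malformed attribute length")
        attributes.append((attr_type, packet[offset + 2:offset + attr_len]))
        offset += attr_len

    return {
        "code": code,
        "code_name": RADIUS_CODES.get(code, "Unknown"),
        "identifier": identifier,
        "authenticator": packet[4:20],
        "attributes": attributes,
    }


def microsoft_vsas(attributes):
    """Yield (vendor_type, data) for each Microsoft vendor-specific attribute."""
    for attr_type, value in attributes:
        if attr_type != VENDOR_SPECIFIC or len(value) < 6:
            continue
        (vendor_id,) = struct.unpack("!I", value[:4])
        if vendor_id != MICROSOFT_VENDOR_ID:
            continue
        vendor_type, vendor_len = value[4], value[5]
        # vendor_len counts the vendor type and length bytes themselves.
        yield vendor_type, value[6:4 + vendor_len]
```

Feeding it the payload of a captured reply, `parse_radius(payload)["code_name"]` gives the same verdict Wireshark displays, and `microsoft_vsas(...)` lists the Microsoft-specific attributes that came with it.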

During this time I did some research on the “MSCHAPv2 over RADIUS” protocol stack and learned that the MSCHAPv2 specification includes a mandatory second check on the RADIUS client side. The client computes the expected authenticator response (a SHA-1-based value derived from the password hash and the challenges) and compares it with the one carried in the Access-Accept packet, to confirm that the server side truly knows the password and validated it.

Here’s the kicker: RouterOS wasn’t doing that check. It was blindly trusting the Accept packet. That’s bad, but not the worst part.
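The client-side check described above corresponds to the GenerateAuthenticatorResponse and CheckAuthenticatorResponse routines in RFC 2759. The following is a minimal sketch of that computation, not RouterOS's or Microsoft's code. It assumes the caller already holds the MD4 hash of the NT password hash (`password_hash_hash`), because MD4 is often unavailable in modern `hashlib` builds; the function names are illustrative.

```python
import hashlib
import hmac

# Constants from RFC 2759, GenerateAuthenticatorResponse.
MAGIC1 = b"Magic server to client signing constant"
MAGIC2 = b"Pad to make it do more than one iteration"


def challenge_hash(peer_challenge: bytes, authenticator_challenge: bytes,
                   username: bytes) -> bytes:
    """First 8 bytes of SHA-1 over both challenges and the bare user name."""
    digest = hashlib.sha1(
        peer_challenge + authenticator_challenge + username
    ).digest()
    return digest[:8]


def generate_authenticator_response(password_hash_hash: bytes,
                                    nt_response: bytes,
                                    peer_challenge: bytes,
                                    authenticator_challenge: bytes,
                                    username: bytes) -> str:
    """Return the 42-character "S=<40 hex digits>" authenticator response.

    password_hash_hash is assumed to be MD4(MD4(UTF-16LE password)),
    computed elsewhere; username excludes any domain prefix.
    """
    digest = hashlib.sha1(password_hash_hash + nt_response + MAGIC1).digest()
    challenge = challenge_hash(peer_challenge, authenticator_challenge, username)
    final = hashlib.sha1(digest + challenge + MAGIC2).hexdigest().upper()
    return "S=" + final


def accept_is_trustworthy(received: str,
                          password_hash_hash: bytes,
                          nt_response: bytes,
                          peer_challenge: bytes,
                          authenticator_challenge: bytes,
                          username: bytes) -> bool:
    """Compare the response from the Access-Accept with the locally computed one."""
    expected = generate_authenticator_response(
        password_hash_hash, nt_response, peer_challenge,
        authenticator_challenge, username,
    )
    # Constant-time comparison; non-ASCII input simply fails to match.
    return hmac.compare_digest(
        expected.encode("ascii"), received.encode("ascii", "replace")
    )
```

A client that runs a check like `accept_is_trustworthy` rejects an Access-Accept whose authenticator response was not derived from the password the user actually typed. A client that skips it has nothing but the packet code to go on. The examples section of RFC 2759 includes test vectors that can be used to validate an implementation like this.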


The Real Problem

At this point I was curious what the AD Domain Controller (DC) was recording in its Event Viewer logs for these authentication attempts. I was shocked to see that the DC had actually been reporting the bad credentials as “good”.

In other words, the DC was telling NPS that the credentials were valid even when they weren’t.

This wasn’t just a RouterOS bug. It was a logic flaw in the DC’s authentication process. When I tested NPS on Server 2022 with a DC running Server 2025, the issue persisted. That meant the vulnerability wasn’t isolated to NPS; it was systemic in AD.


Why This Matters

NTLM isn’t just used for communication between NPS and a DC. Many applications, protocols, and even Windows itself use NTLM. If a DC responds to every authentication request with a “green light”, the entire authentication chain collapses. Here’s the breakdown:

  • NPS was correct: It trusted the DC, as expected.
  • RouterOS was wrong: It skipped the integrity check on the Access-Accept packet. I later found out that this is by design, and that there is a configuration option to enforce the check.
  • DC 2025 was critically wrong: It approved invalid passwords.

This created a two-in-one vulnerability:

  1. A client-side validation gap (RouterOS).
  2. A server-side logic flaw (DC).

But the real danger was on the DC side, because that’s the ultimate source of truth.


Anything Using NTLM Could Be at Risk

Although I didn’t test every scenario, anything using NTLM appeared to be at risk. For example, this could easily be tested on a machine installed from a fresh, unpatched Windows Server 2025 ISO, with the Domain Controller role added and IIS configured for NTLM authentication. If the same logic flaw applies, this vulnerability could extend beyond RADIUS and NPS into other services that rely on NTLM for authentication.


Post-Patch Observation

Since Microsoft released a patch (KB5050009), it appears that communication between the NPS server and the Domain Controller (DC) now uses Kerberos instead of the previous NTLMv2-based flow. This is a significant improvement because Kerberos provides mutual authentication and stronger cryptographic guarantees, reducing the risk of similar logic flaws in the future.


Impact

  • Any RADIUS client that doesn’t validate the MSCHAPv2 authenticator response can grant access with only a valid username. The password is irrelevant.
  • Enterprise environments using NPS for VPN, Wi-Fi, or network access control were at risk.
  • Potentially any service using NTLM could have been vulnerable if the DC logic flaw persisted.

Lessons Learned

  • Never assume protocol integrity checks are optional. RouterOS skipping the MSCHAPv2 response validation amplified the risk.
  • Trust boundaries matter. When the DC fails, everything downstream fails.
  • Test across versions and services. This bug wasn’t a one-off; it was systemic.
  • Protocol choice matters. Moving to Kerberos for NPS/DC communication was the right fix.

Links