Quantifying Uncertainty in Long-Run Default Probability Estimation: A Comparative Analysis of Statistical Approaches for PD Estimation under Regulatory Constraints

In the context of the Internal Ratings-Based (IRB) approach used by banks to estimate credit risk, the accurate quantification of the statistical uncertainty of long-run probabilities of default (PD) is essential. The European Banking Authority (EBA) mandates that institutions incorporate a Margin of Conservatism (MoC) into their estimates to account for uncertainty, particularly under conditions of limited data and correlated defaults. This article by Dominik Scherer and Newton K. Lenkana presents and evaluates three common methods for estimating such statistical uncertainty, emphasizing their strengths, limitations, and regulatory implications.

The authors identify three types of quantification methods commonly employed in practice:

  • Type I: Distribution-Based Approaches: These rely on assumptions about the distribution of defaults (e.g., binomial or Vasicek models). They are straightforward and computationally efficient but require knowledge of correlation parameters, which introduces model dependency.
  • Type II: Empirical Variance-Based Approaches: These use the sample variance of historical default rates. They are model-independent but vulnerable to volatility, especially in the presence of correlated defaults and short time series.
  • Type III: Bootstrapping Techniques: These generate simulated default-rate series by resampling from the observed defaults. This method is also model-free and yields strictly positive confidence bounds by design, but it is computationally intensive and does not necessarily improve accuracy in sparse data environments.

Theoretical Framework and Simulation Setup

All three approaches are evaluated using simulations based on the Asymptotic Single Risk Factor (ASRF) model, which forms the mathematical foundation of Basel II/III regulations. The model simulates portfolios with varying:

  • Default probabilities (p): 0.1%, 1%, 10%
  • Correlation levels (ρ): 0 to 0.2
  • Time series length (T): 10 to 200 observation periods
  • Portfolio size: 5,000 obligors

Simulations (5,000 runs per scenario) and bootstrap iterations (1,000 per run) are used to test confidence interval (CI) coverage and compute MoC values under different scenarios.
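The simulation engine behind these scenarios can be sketched along the following lines: a minimal one-factor (ASRF/Vasicek) simulator in which a shared systematic factor makes defaults correlated within each year. Parameter names and the exact setup are illustrative assumptions, not the authors' code.

```python
import numpy as np
from scipy.stats import norm

def simulate_default_rates(p, rho, T, n_obligors, seed=0):
    """Draw T annual default rates from the one-factor (ASRF/Vasicek) model.

    Each year a systematic factor Z_t shifts the conditional PD:
        p_t = Phi((Phi^{-1}(p) + sqrt(rho) * Z_t) / sqrt(1 - rho)),
    and defaults are then drawn binomially from p_t, which induces the
    within-year correlation across obligors."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(T)                # one systematic draw per year
    cond_pd = norm.cdf((norm.ppf(p) + np.sqrt(rho) * z) / np.sqrt(1 - rho))
    defaults = rng.binomial(n_obligors, cond_pd)
    return defaults / n_obligors

# One scenario from the grid above: p = 1%, rho = 0.12, T = 20, 5,000 obligors
rates = simulate_default_rates(p=0.01, rho=0.12, T=20, n_obligors=5000)
```

With rho = 0 the conditional PD collapses to p and the yearly rates are independent binomial proportions; raising rho fattens the right tail of the annual default-rate distribution, which is what drives the findings below.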

Findings on Margin of Conservatism (MoC)

The analysis of the Margin of Conservatism (MoC) reveals clear and consistent patterns across all three estimation approaches. The MoC serves as a buffer added to the long-run average default rate (LRADR) to account for statistical uncertainty in probability of default (PD) estimates. The simulations show that the MoC increases significantly with rising asset correlation (ρ) and shorter time series lengths (T). This effect is particularly pronounced in low-default portfolios with PD levels around 0.1%, where the required observation periods to achieve reliable estimates would be unrealistically long under practical constraints – especially if defaults are measured annually.

Among the three approaches, the distribution-based Type I consistently yields the most conservative MoC values. This is especially evident in portfolios with low default rates and high correlation among obligors. However, this approach relies on model assumptions, particularly regarding the default correlation parameter, which must either be estimated or set conservatively. Type II approaches, based on empirical variance, produce lower MoCs on average but are more susceptible to volatility in the presence of systemic shocks and require significantly longer time series to achieve similar reliability. Bootstrapping (Type III) was expected to improve coverage and stability but did not show meaningful advantages over Type II. While it guarantees strictly positive confidence intervals, it does not resolve the core issue of undercoverage in highly correlated, low-data scenarios.
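A distribution-based (Type I) bound can be sketched by Monte Carlo under an assumed correlation. This is a simplified stand-in for the parametric constructions in the paper; `rho_assumed` must be supplied externally, which is exactly the model dependency discussed above.

```python
import numpy as np
from scipy.stats import norm

def moc_type1(p_hat, rho_assumed, T, n_obligors, q=0.90, n_sim=5000, seed=0):
    """Type I sketch: simulate the sampling distribution of the T-year mean
    default rate under the one-factor model (with an assumed correlation)
    and read off the distance from p_hat to its q-quantile."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n_sim, T))       # systematic factor, per run and year
    cond = norm.cdf((norm.ppf(p_hat) + np.sqrt(rho_assumed) * z)
                    / np.sqrt(1 - rho_assumed))
    mean_rates = (rng.binomial(n_obligors, cond) / n_obligors).mean(axis=1)
    return np.quantile(mean_rates, q) - p_hat

# Higher assumed correlation and a shorter series both drive the MoC up
low = moc_type1(0.01, rho_assumed=0.02, T=50, n_obligors=5000)
high = moc_type1(0.01, rho_assumed=0.15, T=10, n_obligors=5000)
```

Because the bound is driven by the assumed model rather than the observed sample variance, it stays conservative even when the historical series happens to contain no stress year.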

In summary:

  • Type I approaches provide the highest and most stable MoC levels but depend on model parameters.
  • Type II and III methods are model-free but yield less conservative and more volatile results under adverse conditions.
  • The differences between methods are small when defaults are uncorrelated, but become substantial as correlation and data scarcity increase.

These findings underscore the trade-off between methodological transparency and statistical robustness in PD estimation and suggest that model-based approaches may be preferable when reliability is critical.

Coverage Ratio Evaluation

Despite aiming for a 90% confidence level, the actual CI coverage ratios fell short – especially for Type II and III methods in scenarios of high correlation and short time series. Only Type I maintained robust coverage close to the nominal level, albeit under the assumption of known correlation inputs.
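The undercoverage effect can be reproduced with a small experiment: under the same one-factor model, count how often a Type II-style normal-approximation upper bound actually covers the true p. This is a sketch with illustrative parameters; the exact magnitudes depend on the setup.

```python
import numpy as np
from scipy.stats import norm

def coverage_type2(p, rho, T, n_obligors=5000, q=0.90, n_runs=2000, seed=1):
    """Fraction of simulated histories in which the Type II-style upper bound
    (LRADR plus a normal-quantile empirical-variance add-on) covers the true p."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_runs):
        z = rng.standard_normal(T)
        cond = norm.cdf((norm.ppf(p) + np.sqrt(rho) * z) / np.sqrt(1 - rho))
        rates = rng.binomial(n_obligors, cond) / n_obligors
        upper = rates.mean() + norm.ppf(q) * rates.std(ddof=1) / np.sqrt(T)
        hits += upper >= p
    return hits / n_runs

# Coverage sits near the nominal 90% without correlation, but degrades with it
cov_uncorrelated = coverage_type2(p=0.01, rho=0.0, T=10)
cov_correlated = coverage_type2(p=0.01, rho=0.15, T=10)
```

The mechanism is the right-skewed Vasicek distribution: a short correlated history that happens to contain no stress year understates both the mean and the variance, so the empirical bound lands below the true p far more often than the nominal 10%.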

Regulatory Implications

In the current regulatory landscape shaped by the EBA and ECB, there is no explicit preference among the three estimation types. However, the ECB has emphasized the importance of the length and statistical treatment of the time series. The authors advocate for conservative distribution-based methods (Type I) when empirical methods become unreliable due to limited data or systemic dependencies.

| Approach | Advantages                  | Disadvantages                                              |
|----------|-----------------------------|------------------------------------------------------------|
| Type I   | Accurate CIs, low effort    | Requires correlation input, model-dependent                |
| Type II  | Model-free, simple          | Unstable CIs under high correlation, negative bounds possible |
| Type III | Model-free, positive bounds | High effort, no accuracy gain over Type II                 |

Table 01: Advantages and Disadvantages of Each Method

Conclusion

Estimating long-run PD with high statistical confidence remains a significant challenge, especially in low-default, highly correlated portfolios with short observation periods. While empirical and bootstrapping approaches may suffice under ideal conditions, their reliability diminishes rapidly when correlation increases. Distribution-based methods, particularly those leveraging the ASRF model with conservative assumptions, offer a more resilient path forward. The authors recommend their use, especially under regulatory constraints and in risk-sensitive applications, as they provide more stable estimates even in the face of sparse or noisy data.

Dominik Scherer | Newton K. Lenkana (2025): Anatomy of Estimation Approaches for the Statistical Uncertainty of a Long-Run Probability of Default

