Problems with `find_degrees_of_freedom` for the noncentral t distribution.

(Note: in the following, I'm going to use the mnemonic names `df` and `nc` for degrees of freedom and noncentrality, respectively.  The Greek letters aren't used consistently in the various references I've been using, and I got tired of having to remind myself which symbol is which parameter.)

These are some notes about the code that attempts to invert the CDF of the noncentral t distribution with respect to the degrees of freedom.  The code was added in https://github.com/boostorg/math/pull/1355 (or more accurately, reenabled).  The old comment said "This code is disabled, since there can be multiple answers to the question, and it's not clear how to find the 'right' one."  As noted in that comment, there are cases where the function being inverted is not monotone; in particular the function is not monotone when `x` and `nc` have the same sign and `|nc|` is sufficiently large.

The limits as `df -> 0` and `df -> inf` of the noncentral t CDF are `Phi(-nc)` and `Phi(x - nc)`, respectively, where `Phi(z)` is the CDF of the standard normal distribution.  So in the monotone case, these value determine the domain of the inverse.  Outside of these values, the inverse function should probably trigger a domain error.  Even if the function is not monotone, one of the bounds is still correct.

The "problems" referred to in the issue title are:
* The solver typically fails in the input regions where the inverse is multivalued.
* The solver doesn't trigger domain errors in regions where there is no inverse. For example, with `nc = -2.5`, `x = 1.25` and `p = 0.99`, `find_degrees_of_freedom(nc, x, p)` returns `5.30499e-313`, when instead it should trigger a domain error. (See the fourth plot below.)

It would be useful to investigate:
(1) Under what conditions are we assured that the function is monotone?  In this case, we know the domain of the inverse, so inputs to the inverse that are outside the domain can trigger a domain error.
(2) Can we be smarter in the non-monotone case? How hard would it be to ensure that the return value is always the lower branch (i.e. the smaller `df`), and the solver succeeds on the full extent of this branch?

Currently, the solver typically fails when the inverse is multivalued.  Running a grid of `p` inputs shows that the solver will find some solutions a small way along the lower branch, but then starts failing as `p` moves further into the region of multivalued results.

<strike>Getting (2) working reliably might be a lot of work.  An alternative would be to *define* the behavior such that the domain of the inverse is restricted to the region where the inverse is single valued. (This is almost what is happening now anyway, since the solver often fails in the multivalued region.)  This would be easy to implement, because we know the domain  in that case.  Of course, this is a bit of a cop-out, but it would provide a well-defined and unambiguous behavior.</strike> The example added in a follow-up comment makes this idea less useful.

-----

Here are plots of the function to be inverted that include various combinations of `x` and `nc`.  The first and last plots show examples where the function is not monotone.  For all of them the `df` axis uses log scaling.  For the last plot, the vertical axis also uses a log-scale.  The label of the vertical axis, `nctdtr(df, nc, x)`, uses the (somewhat cryptic) SciPy name of the CDF of the noncentral t distribution.

The first three plots have `x < 0`:

<img width="640" height="480" alt="Image" src="https://github.com/user-attachments/assets/863a45cf-392a-417e-937f-655becf3e286" />

<img width="640" height="480" alt="Image" src="https://github.com/user-attachments/assets/04a6a39c-b942-40ab-baf9-55db502f3471" />

<img width="640" height="480" alt="Image" src="https://github.com/user-attachments/assets/b1076ad7-20a6-4368-a942-a70b5f6f54af" />

The next three have `x > 0`:

<img width="640" height="480" alt="Image" src="https://github.com/user-attachments/assets/1e5836e8-b60b-46c2-9930-d986f033b318" />

<img width="640" height="480" alt="Image" src="https://github.com/user-attachments/assets/88ad66a6-21ce-4e5d-8268-5e528ea83a6b" />

This one uses a log-scale on the vertical axis to make clearer the behavior where `df` is very small:

<img width="640" height="480" alt="Image" src="https://github.com/user-attachments/assets/3636239b-b2c2-4332-a5e8-7bfd3ff38644" />

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Problems with `find_degrees_of_freedom` for the noncentral t distribution. #1410

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Problems with find_degrees_of_freedom for the noncentral t distribution. #1410

Description

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions

Problems with `find_degrees_of_freedom` for the noncentral t distribution. #1410