use posterior::gpdfit and posterior::qgeneralized_pareto() by avehtari · Pull Request #305 · stan-dev/loo

avehtari · 2025-10-17T17:09:46Z

Since we do use posterior package elsewhere,
replace gpdfit() and qgpd() with posterior::gpdfit and posterior::qgeneralized_pareto()

This will make maintenance easier. For example, there is a PR for posterior::gpdfit() to make it more robust and posterior::qgeneralized_pareto() already is more robust than qgpd()

This is related to #249, but timing is due to the fix PR for posterior::gpdfit()

jgabry

I think the R cmd check error is because the NAMESPACE file is still trying to export gpdfit from loo. If you run devtools::document() it should fix this and then we can see if all the tests pass.

avehtari · 2025-10-17T18:08:55Z

If it has been exported, then someone might be using it? Should I put it back and define it with posterior::qgeneralized_pareto()?

avehtari · 2025-10-17T18:34:54Z

At least in github only I and Juho Timonen have used loo::gpdfit, so I guess it can be removed

jgabry · 2025-10-17T18:35:33Z

Good point. Yeah you can just replace the bodies of the functions with calls to the posterior versions.

jgabry · 2025-10-17T18:37:09Z

At least in github only I and Juho Timonen have used loo::gpdfit, so I guess it can be removed

OK in that case, we can try removing it and we will find out in the reverse dependency checks whether any packages are using it

jgabry · 2025-10-17T19:09:17Z

The error now is because apparently these functions aren't exported from posterior, they're just internal functions. That would be easy to change but it would require doing a CRAN release of posterior.

avehtari · 2025-10-17T19:29:25Z

my bad, I guess this needs to wait for the next posterior release

jgabry · 2025-10-17T19:35:21Z

There is also #290 from @VisruthSK that is waiting to be merged until there is a new release of posterior

Update use-posterior-gpdfit branch

codecov-commenter · 2026-04-07T17:40:12Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 92.75%. Comparing base (df56db4) to head (8d83fc2).
⚠️ Report is 13 commits behind head on master.

Additional details and impacted files

@@            Coverage Diff             @@
##           master     #305      +/-   ##
==========================================
+ Coverage   92.73%   92.75%   +0.02%     
==========================================
  Files          31       31              
  Lines        3057     2983      -74     
==========================================
- Hits         2835     2767      -68     
+ Misses        222      216       -6

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

jgabry · 2026-04-07T19:06:23Z

I think this needs to bump the posterior version in DESCRIPTION (unless it's merged after the other one that does that, but not sure which will be merged first)

VisruthSK · 2026-04-07T20:09:29Z

Right, in my head it was being merged after the previous one. Will fix that now.

jgabry · 2026-04-07T21:31:44Z

Right, in my head it was being merged after the previous one. Will fix that now.

We need to figure out what to do about the warnings in the other PR, so will probably do this one first. But for this one, it occurs to me that we need to reexport gpdfit from posterior in order to avoid breaking backwards compatibility (it's currently exported from loo so could be called as loo::gpfit in someone's code.

VisruthSK · 2026-04-07T21:53:51Z

Will restore tests too then

jgabry · 2026-04-07T22:00:08Z

Do you think we need to test it in addition to exporting it? I assume it's tested in posterior. Maybe good to test here too, but I'm not sure

VisruthSK · 2026-04-07T22:01:26Z

If we are rexporting it I think it makes to test, mainly so that the API is the same no? The tests were super minimal anyway.

jgabry · 2026-04-07T22:01:55Z

Ok yeah good call

This reverts commit 21bdd28.

…into use-posterior-gpdfit

jgabry · 2026-04-08T01:13:06Z

Aside from needing to reexport and add the tests back, this looks good.

jgabry · 2026-04-08T01:17:04Z

Since we had a snapshot test, adding the tests back should tell us whether the posterior version returns the exact same values as our previous loo version right? Or was the snapshot generated with this new version? We should be sure we're not getting meaningfully different results.

VisruthSK · 2026-04-08T04:25:41Z

Snaps aren't modified in this run, so if tests pass they're being checked against the originally generated ones.

jgabry

Looks good, thanks. Although I wish we had #348 set up to make sure the posterior versions aren't less efficient. For example, maybe they add overhead due to doing more type and argument checking that we skipped in the previous loo implementation?

VisruthSK · 2026-04-08T22:03:32Z

I mean this isn't an urgent PR right? I can setup the mechanics of touchstone if you'd like, but the real work is deciding what benchmarks to run, which will require more careful thought and discussion.

jgabry · 2026-04-08T22:16:53Z

Thanks, yeah not super urgent. I don't think the benchmarks need to be too complicated. We probably don't need to benchmark individual functions other than loo(), at least not right away (or maybe ever). Since we're just worried about efficiency here it doesn't matter if it's a realistic example that we use for the benchmark. Maybe codex and claude are good enough to suggest some that don't take too long to run but are long enough that we notice performance regressions? That's probably the tricky part here, right? We don't want CI to take forever but we want to notice real regressions. Or are there other considerations you can think of?

VisruthSK · 2026-04-08T22:50:42Z

Hm fair enough I guess we could just run the toplevel function. I suppose that's what I've been doing too--but it would be nice to get a little but of a profile by breaking down the call into some cogent sub-calls--i.e., if there's a PR which speeds up PSIS, it'd be nice to see that most things don't change but PSIS takes half the time or whatever.

jgabry · 2026-04-08T22:52:14Z

Yeah good point

VisruthSK · 2026-04-08T23:13:54Z

I guess we could shelve the finer touchstone tests for now and revist once we already have working pipeline--writing in #352.

Update posterior gpdfit

github-actions · 2026-04-14T21:56:56Z

This is how benchmark results would change (along with a 95% confidence interval in relative change) if cddb7cc is merged into master:

🚀loo_function: 1.99s -> 1.74s [-12.9%, -12.07%]
🚀loo_matrix: 1.53s -> 1.28s [-17.25%, -16.27%]
Further explanation regarding interpretation and methodology can be found in the documentation.

jgabry · 2026-04-14T22:18:41Z

This is how benchmark results would change (along with a 95% confidence interval in relative change) if cddb7cc is merged into master:

🚀loo_function: 1.99s -> 1.74s [-12.9%, -12.07%]

🚀loo_matrix: 1.53s -> 1.28s [-17.25%, -16.27%]
Further explanation regarding interpretation and methodology can be found in the documentation.

Ok so using the posterior versions makes it faster?

VisruthSK · 2026-04-15T00:44:41Z

Yup looks like it. We should probably increase the number of repetitions if we care about the actual reduction amount. Kinda odd to have confidence intervals in a Stan project...

jgabry · 2026-04-15T00:57:32Z

I agree we should increase the repetitions. We’ll forgive the author of the touchstone package for the lack of a Bayesian analysis ;)

VisruthSK · 2026-04-15T01:44:23Z

New package, touchstan?

Do you want to bump up the repeats and see how the numbers shake out, or do you think this is good to merge?

avehtari · 2026-04-15T06:00:48Z

Confidence interval is acceptable, but definitely there is not enough accuracy to justify more than two significant digits

1.53s -> 1.28s [-17.25%, -16.27%]

jgabry · 2026-04-15T18:15:16Z

Do you want to bump up the repeats and see how the numbers shake out, or do you think this is good to merge?

I would increase the repetitions if the code is pretty fast. But it's still taking ~1 hr (https://github.com/stan-dev/loo/actions/runs/24422791263/job/71349231024?pr=305). Is that because it's not caching the packages?

VisruthSK · 2026-04-15T20:10:19Z

Yes I think it's building a lot of packages from source. I think I know a way around that if reruns are slow. If it's back to the times we saw before, about 5 minutes for a whole run at this repetition count, what do you think would be a good number of reps? How slow is okay, 30 min?

jgabry · 2026-04-15T20:16:08Z

Yeah, I think 20-30 minutes is fine

Update posterior gpdfit branch to get newer touchstone functionality

github-actions · 2026-04-16T23:22:39Z

This is how benchmark results would change (along with a 95% confidence interval in relative change) if 8d83fc2 is merged into master:

🚀loo_function: 2.07s -> 1.74s [-16.4%, -15.61%]
🚀loo_matrix: 1.57s -> 1.28s [-18.73%, -18.17%]
Further explanation regarding interpretation and methodology can be found in the documentation.

VisruthSK · 2026-04-17T00:59:50Z

@jgabry good to merge? This is with n=60 reps per branch. Slow touchstone action because I moved to new ubuntu and R version.

jgabry · 2026-04-17T14:35:10Z

Thanks, yeah I think this one is good to go. Merging now.

…pdfit" This reverts commit 85db294, reversing changes made to df56db4.

avehtari added 3 commits October 17, 2025 19:54

use posterior::gpdfit

9f9e9ad

use posterior::qgeneralized_pareto()g

05c69e1

delete gpd functions

aae6c17

avehtari requested a review from jgabry October 17, 2025 17:10

jgabry reviewed Oct 17, 2025

View reviewed changes

remove gpdfit doc and export

099398b

VisruthSK marked this pull request as draft February 21, 2026 21:30

VisruthSK added 2 commits April 7, 2026 10:27

Merge pull request #346 from stan-dev/master

8502999

Update use-posterior-gpdfit branch

Deleted gpdfit tests

21bdd28

VisruthSK marked this pull request as ready for review April 7, 2026 17:40

VisruthSK requested a review from jgabry April 7, 2026 17:41

Update DESCRIPTION

dce7e3c

VisruthSK added 2 commits April 7, 2026 18:10

Revert "Deleted gpdfit tests"

1b2678e

This reverts commit 21bdd28.

Merge branch 'use-posterior-gpdfit' of https://github.com/stan-dev/loo …

ac74d1e

…into use-posterior-gpdfit

jgabry mentioned this pull request Apr 8, 2026

Performance benchmarks for PRs? #348

Closed

Exported gpdfit, removed qgpd

ccb9893

jgabry approved these changes Apr 8, 2026

View reviewed changes

Merge pull request #354 from stan-dev/master

cddb7cc

Update posterior gpdfit

VisruthSK mentioned this pull request Apr 14, 2026

Use touchstone #352

Merged

Merge pull request #359 from stan-dev/master

8d83fc2

Update posterior gpdfit branch to get newer touchstone functionality

jgabry merged commit 85db294 into master Apr 17, 2026
8 checks passed

jgabry deleted the use-posterior-gpdfit branch April 17, 2026 14:35

vinniott added a commit to vinniott/loo that referenced this pull request May 10, 2026

Revert "Merge pull request stan-dev#305 from stan-dev/use-posterior-g…

9d199fb

…pdfit" This reverts commit 85db294, reversing changes made to df56db4.

Uh oh!

Conversation

avehtari commented Oct 17, 2025

Uh oh!

jgabry left a comment

Choose a reason for hiding this comment

Uh oh!

avehtari commented Oct 17, 2025

Uh oh!

avehtari commented Oct 17, 2025

Uh oh!

jgabry commented Oct 17, 2025

Uh oh!

jgabry commented Oct 17, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jgabry commented Oct 17, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

avehtari commented Oct 17, 2025

Uh oh!

jgabry commented Oct 17, 2025

Uh oh!

codecov-commenter commented Apr 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

jgabry commented Apr 7, 2026

Uh oh!

VisruthSK commented Apr 7, 2026

Uh oh!

jgabry commented Apr 7, 2026

Uh oh!

VisruthSK commented Apr 7, 2026

Uh oh!

jgabry commented Apr 7, 2026

Uh oh!

VisruthSK commented Apr 7, 2026

Uh oh!

jgabry commented Apr 7, 2026

Uh oh!

jgabry commented Apr 8, 2026

Uh oh!

jgabry commented Apr 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

VisruthSK commented Apr 8, 2026

Uh oh!

jgabry left a comment

Choose a reason for hiding this comment

Uh oh!

VisruthSK commented Apr 8, 2026

Uh oh!

jgabry commented Apr 8, 2026

Uh oh!

VisruthSK commented Apr 8, 2026

Uh oh!

jgabry commented Apr 8, 2026

Uh oh!

VisruthSK commented Apr 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions Bot commented Apr 14, 2026

Uh oh!

jgabry commented Apr 14, 2026

Uh oh!

VisruthSK commented Apr 15, 2026

Uh oh!

jgabry commented Apr 15, 2026

Uh oh!

VisruthSK commented Apr 15, 2026

Uh oh!

avehtari commented Apr 15, 2026

Uh oh!

jgabry commented Apr 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

VisruthSK commented Apr 15, 2026

Uh oh!

jgabry commented Apr 15, 2026

jgabry commented Oct 17, 2025 •

edited

Loading

jgabry commented Oct 17, 2025 •

edited

Loading

codecov-commenter commented Apr 7, 2026 •

edited

Loading

jgabry commented Apr 8, 2026 •

edited

Loading

VisruthSK commented Apr 8, 2026 •

edited

Loading

jgabry commented Apr 15, 2026 •

edited

Loading