Changelog
Source: NEWS.md
leafwax 0.2.4
Documentation
- Add Zenodo concept DOI (10.5281/zenodo.20172570) to CITATION.cff and surface it as a badge in the README. The package can now be cited as software in addition to the related manuscript.
- Citation block in the README provides two BibTeX entries: a @software entry for the Zenodo archive and an @unpublished entry for the manuscript (the manuscript is in preparation).
leafwax 0.2.3
Documentation
- Add CITATION.cff describing the package as software for the CFF tooling ecosystem and GitHub’s “Cite this repository” button.
- Paleo vignette: note that baseline_sp is used in the worked example for simplicity and that baseline_env_sp is the companion variant for matching detection-threshold figures elsewhere.
leafwax 0.2.2
Bug fixes (runtime correctness)
- invert_d2H() and invert_d2h(): the reported credible interval did not include the model’s posterior residual SD sigma, so the reported prediction_interval_width was the wrong quantity for single-site reconstruction. At a typical site this gave intervals roughly an order of magnitude narrower than the manuscript’s per-site uncertainties. The wax-error draw now follows manuscript supplement Section S4.1 Eq. 7: the analytical and residual terms are combined in quadrature, sqrt(analytical^2 + sigma_residual^2), with parameter and spatial uncertainty carried through the per-iteration posterior draws. This applies uniformly to absolute single-point reconstruction and to within-record contrasts; the spatial GP intercept’s contribution cancels in any contrast computed from the returned posterior_draws (manuscript Section 4.5.3).
- detect_change(): the threshold-formula argument is renamed from sigma_within to sigma_residual to match the manuscript’s framework. Pass the model’s posterior sigma (approximately 16 per mil for the spatial models). The function no longer requires the reconstruction to be built in any special mode.
- assess_claim(): dropped the claim$sigma_within field. The L1 threshold uses analytical uncertainty alone (manuscript Section 4.5.3); L3 still uses the inversion’s posterior_draws but no longer requires a separate within-record SD.
- Removed estimate_sigma_within(). Within-record uncertainty in this framework comes from the calibration’s sigma_residual combined with the spatial GP intercept’s cancellation in a contrast; there is no separate record-specific SD to estimate.
- invert_d2H_ensemble(): rewritten to fix multiple bugs in the default-args path. Return shape changed (see Breaking changes below).
  - The previous default models = argument used v0.1 names that are not in the v10 registry; replaced with c("full_sp", "full_interact_sp", "elevation_c4_interact_sp").
  - Validation accessed available_models()$model, but available_models() returns a character vector, so the subset was always NULL and validation was silently skipped; now reads the vector directly.
  - The inner invert_d2H() call did not pass return_full = TRUE, so the pooling step downstream read $posterior_draws from a summary data frame and got NULL; now forces return_full = TRUE.
  - The pooling step itself flattened the per-model posterior_draws matrix across BOTH the draw and site axes, collapsing multi-site input to one scalar mean shared across all sites; now pools per-site, per-draw across models.
  - The pool also previously concatenated all per-model draws and then sampled, which gave models with more draws (e.g. 1000-draw heavy tier vs. 100-draw preview tier) a proportionally larger share of the “equal” pool; now resamples each model to a uniform n_target / k draws first and concatenates.
- compare_models(): the default models = argument referenced v0.1 names; replaced with c("baseline", "baseline_sp", "full_sp"). Several latent failures also blocked the default invocation.
  - A verbose partial-match conflict when forwarding ... to predict_d2h_precip() was producing “formal argument "verbose" matched by multiple actual arguments”; switched to do.call() with a filtered extra-args list.
  - Single-row input made the per-model means a length-N vector with NULL dim(), tripping the row-wise apply(); coerced to a 1xN matrix in that case.
  - The per-model column rename for return_all = TRUE was being applied unconditionally and broke the ensemble-summary path; now only applied on the return_all = TRUE path.
  - A user typo in ... (e.g. verb = FALSE) used to cause every per-model do.call() to fail with “unused argument”, the per-model tryCatch would swallow each failure as a warning, and the function would abort with the misleading “All models failed”; compare_models() now validates ... against predict_d2h_precip()’s formals up front and reports unknown argument names with a clear error before the model loop runs.
  - The models_used field of the returned ensemble summary used to report the originally requested set rather than the models that actually contributed; now reports names(model_results) so partial-failure runs are not silently misreported.
- predict_d2h_precip() and invert_d2H(): the c4_fraction * 100 conversion was unconditional, so a NULL input became numeric(0) and tripped a spurious capability-mismatch warning inside the core inversion. Both wrappers now keep NULL as NULL. They also now reject vector-length mismatches between c4_fraction and d2H_wax (was: silent R recycling with a generic warning) and validate that user inputs lie in [0, 1]. The out-of-range error names the offending maximum value and tells the caller to divide by 100 if the input is on the percent scale.
- get_data_manifest(): previously returned list(files = list()) on download failure, which downstream verify_data_integrity() read as “no checksums available, allow the file.” Now returns NULL with a warning(); verify_data_integrity() treats NULL as “verification skipped” and warns explicitly.
- check_data_cache(), list_cached_models(), and get_cache_files(): download_model_data() writes a single posterior file per model at posteriors/<model>_posterior.rds, but these three helpers were still looking for the v0.1 layout (metadata/<model>_metadata.rds, posteriors/<model>_2000draws.rds, and posteriors_full/<model>_complete.rds). After any successful download they reported the model as absent. The three helpers now read the canonical v0.2 layout. The data_type argument of check_data_cache() is retained for API compatibility but is now a no-op.
- data_urls.json: base_url_latest previously pointed at the main branch of bradleylab/leafwax-data while manifest_url was pinned to v1.0.1. Aligned both to v1.0.1 so a future non-no-op verify_data_integrity() cannot trip on drift between latest-branch files and a v1.0.1 manifest.
- clear_download_cache() no longer creates the cache directory before checking that it exists. Replaced the default get_cache_dir() call with get_cache_dir(create = FALSE).
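The quadrature combination described for invert_d2H() can be illustrated with a few lines of R. This is a minimal sketch, not the package's code: the variable names are hypothetical, the 3 per mil analytical value is the package default named in these notes, and 16 per mil is the approximate posterior sigma quoted for the spatial models.

```r
# Illustrative values only (assumptions taken from the figures quoted in
# these notes, not read from a fitted model).
sigma_analytical <- 3   # per mil, analytical default
sigma_residual <- 16    # per mil, approximate posterior residual SD

# Combine the two error terms in quadrature.
sigma_wax <- sqrt(sigma_analytical^2 + sigma_residual^2)

# One wax-error draw per posterior iteration (hypothetical draw count).
set.seed(42)
n_draws <- 1000
wax_error <- rnorm(n_draws, mean = 0, sd = sigma_wax)

print(round(sigma_wax, 2))
```

With these inputs the combined SD is about 16.3 per mil, so the residual term dominates the analytical one, which is why omitting it made the earlier intervals too narrow.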
Bug fixes (post-review pass)
These fixes resolve issues surfaced by the pre-CRAN review pass.
- invert_d2H_ensemble() aborted with “formal argument ‘return_full’ matched by multiple actual arguments” whenever the caller passed return_full (or model_name) through .... The wrapper now strips both names from ... before the per-model loop. The function also now warns when models contribute unequal draw counts and pools each model to the median count rather than the first model’s count.
- process_sequential() referenced processing_time outside the if (progress) {...} block where it was assigned, throwing “object ‘processing_time’ not found” whenever batch_predict(..., progress = FALSE) was called.
- batch_predict() aborted with “numbers of columns of arguments do not match” when one chunk errored to the smaller fallback shape and another succeeded with the full schema. The combine step now uses a column-tolerant helper (.rbind_chunks) that pads missing columns with NA.
- detect_change() aborted on the same rbind mismatch when an empty test_interval was paired with magnitudes. The empty-interval branch now appends NA magnitude columns so the column set matches populated rows.
- The “elevation” code path in invert_d2h() was unreachable for every v10 model. None of the 14 fitted posteriors carry beta_elev columns, so model$elevation was never populated and the function unconditionally hit a “knots not found” warning that silently dropped any user-supplied elevation. Removed the dead spline branch; has_elevation is now derived from the actual posterior columns and is FALSE for every shipped v10 model. The “env” variants instead carry a beta_precip term, exposed via metadata$has_precip.
- validate_inputs(): the validated PFT vectors were dropped from the returned list when the model used the v10 has_vegetation flag (rather than the legacy has_pft). The output now honours both names.
- download_with_progress(): the error handler called close() on con_in/con_out unconditionally, raising a secondary error inside the handler when either had not yet been opened. The handler now only closes connections that were actually opened.
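The column-tolerant combine step mentioned for batch_predict() can be sketched as follows. The helper name rbind_pad and the example columns are hypothetical; the package's internal .rbind_chunks may differ in detail.

```r
# Hypothetical stand-in for a column-tolerant combine step: pad each chunk
# with NA for any column it lacks, then row-bind on a shared column order.
rbind_pad <- function(chunks) {
  all_cols <- unique(unlist(lapply(chunks, names)))
  padded <- lapply(chunks, function(df) {
    absent <- setdiff(all_cols, names(df))
    for (col in absent) df[[col]] <- NA
    df[, all_cols, drop = FALSE]
  })
  do.call(rbind, unname(padded))
}

# One chunk with the full schema, one with a smaller fallback shape.
full <- data.frame(site = "A", d2h_precip_mean = -40, d2h_precip_sd = 12,
                   stringsAsFactors = FALSE)
fallback <- data.frame(site = "B", error = "chunk failed",
                       stringsAsFactors = FALSE)

combined <- rbind_pad(list(full, fallback))
print(combined)
```

A plain rbind(full, fallback) on these two data frames would stop with the column-count error quoted above; padding first keeps every row and marks the gaps explicitly.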
Repo cleanup
- Removed the unreachable lookup-table subsystem (R/lookup_tables.R, R/lookup_integration.R, the predict_spatial_mpp() deprecated stub, and the use_lookup argument of predict_d2h_precip()). The path was never wired to a shipping data archive; spatial predictions now always go through predict_spatial_dual_gp() against the live posterior.
- Removed clear_data_cache(). Use clear_download_cache() instead; the two helpers were near-duplicates with no behavioural difference.
- Removed dead exports: batch_invert_d2h(), monitor_memory(), verify_data_integrity(), setup_leafwax_data(), select_best_model(), get_model_recommendations(). None had callers in the package and several relied on broken legacy paths.
- Removed the misleading text-progress bar in predict_d2h_precip(): it jumped from 0% to 100% in a single step regardless of work done, because the inversion runs in one pass with no per-iteration callback. The progress argument is retained but is now a no-op at this layer; chunked progress reporting is still driven by batch_predict().
Internal cleanup
- LEAFWAX_DEFAULTS (in R/zzz.R) is now the single source of truth for leafwax.* user options. leafwax_config() and leafwax_set_config() derive their option lists from it, so suppress_preview_warning (and any future option) round-trips without manual list maintenance.
- Capability flags are now derived from the actual posterior columns in load_posteriors(). model_compatibility.R keeps the name-based view for callers that need expected-schema info without loading the model; the two views agree on what each shipped v10 variant contains.
- get_cache_info() now classifies cache files using the v0.2 download layout (posteriors/<model>_posterior.rds, spatial_metadata/<model>_knots.rds, manifest.json). The previous regex matched only legacy v0.1 names, so the function silently classified every real cache entry as “other”.
- load_posteriors() now warns when it falls back to a freshly generated 125-knot Fibonacci sphere because the model’s knot file is missing. The substituted knots are not byte-identical to the v10 fit, so silent substitution is methods drift.
- invert_d2h() warns (was: a print statement) when model$scaling is missing and the inversion uses the conservative PLACEHOLDER_SCALING defaults; those scales are not the v10 fitted ones and reconstructions will not match the published calibration.
- Magic numbers (analytical default 3 per mil, 125 spatial knots, default C4 percent, default PFT split, placeholder scaling) are promoted to named constants in R/constants.R.
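Classifying cache entries against the v0.2 layout can be sketched with anchored regular expressions. The function name and the category labels other than “other” are hypothetical; only the three path patterns come from the notes above, and the patterns actually used by get_cache_info() may differ.

```r
# Hypothetical classifier for cache-relative paths under the v0.2 layout.
classify_cache_file <- function(path) {
  if (grepl("^posteriors/[^/]+_posterior\\.rds$", path)) return("posterior")
  if (grepl("^spatial_metadata/[^/]+_knots\\.rds$", path)) return("knots")
  if (identical(path, "manifest.json")) return("manifest")
  "other"
}

paths <- c(
  "posteriors/baseline_sp_posterior.rds",
  "spatial_metadata/baseline_sp_knots.rds",
  "manifest.json",
  "posteriors/b0b1_2000draws.rds"  # legacy v0.1 name, falls through
)
classes <- vapply(paths, classify_cache_file, character(1), USE.NAMES = FALSE)
print(data.frame(path = paths, class = classes))
```

Anchoring each pattern with ^ and $ keeps legacy file names from matching by accident, so they fall through to the catch-all category.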
Repo organization
- Untracked leafwax.Rcheck/ artifacts (the directory is now matched by *.Rcheck/ in .gitignore).
- Removed README_EXAMPLES.md, test_direct.R, test_inversion.R from the repo root; these were orphaned scratch artifacts already excluded from the tarball via .Rbuildignore.
- Moved PLAN_v0.2.0.md and PLAN_v0.2.2.md into dev-notes/.
- Removed the archived v0.1.0 test directory (tests/_archive_v0.1.0/).
- Added cran-comments.md.
- .Rbuildignore cleaned of stale entries (vignettes/_archive_v0.1.0, test_*.R, README_EXAMPLES.md, PLAN_v*.md); these paths no longer exist or are now covered by directory-level rules.
Regression tests covering these fixes are in tests/testthat/test-cleanup-v022.R.
Breaking changes
- invert_d2H(): reported intervals are now wider (the posterior predictive includes the residual sigma). The point estimate (d2h_precip_mean, d2h_precip_median) is unchanged; d2h_precip_sd, d2h_precip_lower, d2h_precip_upper, and prediction_interval_width are wider.
- invert_d2H() and invert_d2h(): removed sigma_within and sigma_within_sd arguments. The function applies the calibration’s sigma_residual directly.
- detect_change(): renamed argument sigma_within to sigma_residual.
- assess_claim(): removed required claim$sigma_within field.
- estimate_sigma_within() is removed.
- invert_d2H_ensemble() return shape changed. posterior_draws is now an n_draws x n_sites matrix (previously: a flat vector of length n_draws for single-site, silently corrupted for multi-site). ensemble_summary is now a data frame with one row per site (previously: a list of scalars, only correct for single-site input). The list of pooled models is now exposed at the top level as models_used; the previous ensemble_summary$models_used is gone. Single-site code that read e$ensemble_summary$mean continues to work: the data frame has one row, and $mean returns its single value. Multi-site code is breaking by definition: callers were reading wrong numbers before this fix.
- The exported data objects model_metadata, mini_posteriors, and mini_lookup_table are removed. They held v0.1 model names and were not used by any v10 code path; metadata is now exposed via get_all_model_metadata() and posteriors via load_posteriors() (with the lazy-load resolver). Users still calling data(model_metadata) will see “data set not found” and should switch to get_all_model_metadata().
- The legacy v0.1 helpers load_model_posteriors(), check_model_data(), use_example_data(), and get_model_size_estimate() are removed. New code should call load_posteriors(), check_data_cache(), and the lazy-load download path. The synthetic-data fallback inside use_example_data() is also gone; missing posteriors now produce an explicit error instead of fabricated draws.
- The internal .download_model_data_v0_1() and its private helper get_download_files() are removed (the exported download_model_data() in R/download_data.R is unchanged).
- validate_inputs()’s default model_name changed from "b0b1" to "baseline". The previous default was a v0.1 name not in the v10 registry, so calls that relied on the default would have errored in get_all_model_metadata() lookup; the new default is the closest v10 equivalent.
- compare_models()’s NULL-fallback model set changed from v0.1 names ("b0b1", "b0b1_elev", "b0b1_sp") to v10 ("baseline", "baseline_sp", "full_sp"). Same rationale: the old default was unreachable.
- The lookup-table API is removed (create_lookup_table(), use_lookup_if_available(), predict_spatial_mpp(), validate_lookup_table(), get_spatial_params(), cache_all_lookup_tables(), benchmark_lookup(), generate_global_grid(), plus the print.leafwax_lookup_table method). The path was never wired to a published data archive; no v0.2 caller used it.
- predict_d2h_precip(): removed the use_lookup argument (no-op since the lookup-table subsystem is gone).
- clear_data_cache() removed; use clear_download_cache().
- batch_invert_d2h(), monitor_memory(), verify_data_integrity(), setup_leafwax_data(), select_best_model(), and get_model_recommendations() are removed.
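The new invert_d2H_ensemble() return shape can be pictured with simulated draws. This is a sketch under stated assumptions, not the package's implementation: the draw matrices are random numbers, n_target is a hypothetical pool size, and rows are resampled jointly across sites so that each pooled row is one draw for every site.

```r
set.seed(1)
n_sites <- 2

# Hypothetical per-model draw matrices (rows = draws, columns = sites),
# with unequal draw counts as in a heavy tier vs. a preview tier.
per_model <- list(
  full_sp = matrix(rnorm(1000 * n_sites, mean = -40, sd = 10), ncol = n_sites),
  full_interact_sp = matrix(rnorm(100 * n_sites, mean = -45, sd = 10),
                            ncol = n_sites)
)

n_target <- 200                 # assumed total pool size
k <- length(per_model)
per_share <- n_target %/% k     # equal share of draws per model

# Resample each model to the same number of rows, then stack the models.
pooled <- do.call(rbind, unname(lapply(per_model, function(draws) {
  idx <- sample.int(nrow(draws), per_share, replace = nrow(draws) < per_share)
  draws[idx, , drop = FALSE]
})))

# One summary row per site, computed column-wise from the pooled matrix.
ensemble_summary <- data.frame(
  site = seq_len(n_sites),
  mean = colMeans(pooled),
  sd = apply(pooled, 2, sd)
)
print(dim(pooled))
print(ensemble_summary)
```

Because the site axis is never flattened, each column of the pooled matrix stays tied to one site, and a single-site call yields a one-row data frame whose $mean is a single value.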
Documentation and naming-drift cleanup
- All exported \dontrun{} and runnable examples now reference v10 model names (baseline, baseline_sp, baseline_env, …) instead of the v0.1 b0b1_* names. The v0.1 names are not available in the v10 model registry; the prior examples would have produced “model not found” errors if a user tried to run them.
- R/leafwax-package.R runnable example now passes model_name = rather than relying on R’s partial-argument matching of model =.
- curlGetHeaders qualified as base::curlGetHeaders at the call site (it is a base function, not a utils export).
- get_url_config() fallback substitutes the real bradleylab GitHub organization instead of a [YOUR-USERNAME] placeholder. The fallback only fires for broken installs where data_urls.json is missing from inst/extdata/.
- inst/extdata/model_info.json description for c4_fraction rewritten to match the actual contract (fraction [0, 1] on the public API, converted to percent internally).
- Drafting-history breadcrumbs (“Phase A”, “Phase B”, “Codex P2 on Phase B”) are removed from in-source comments. Substantive content is preserved.
- inst/examples/ (four v0.1 example scripts) and data-raw/{copy_posteriors,prepare_external_data,prepare_external_data_quick,prepare_package_data,_legacy_extract_spatial_metadata_120knot}.R + data-raw/upload_instructions.md + data-raw/README.md removed. The v10 vignette paleo-record-workflow.Rmd is now the canonical end-to-end example.
- README.md rewritten to describe the lazy-load architecture: inst/extdata/posteriors_light/ ships in the tarball and full posteriors are downloaded from bradleylab/leafwax-data v1.0.1 (Zenodo) on first use, instead of the prior text claiming the package shipped ~10 MB of posteriors directly.
- .Rbuildignore widened from ^PLAN_v0\.2\.0\.md$ to ^PLAN_v.*\.md$ so future PLAN files are auto-excluded; the now-obsolete ^inst/examples$ line is removed.
- _pkgdown.yml reference index pruned: the deleted v0.1 helpers, the three deleted data exports, and the “legacy” framing on the lower-level helpers section are removed.
- tests/testthat/helper-data.R pruned to just the leafwax.suppress_preview_warning option setter; the b0b1-default mock helpers (create_mock_metadata, create_mock_posteriors, create_mock_lookup_table, create_test_data, skip_on_cran_and_ci, model_available) were dead code, not called by any test file.
Tests
- New regression test file tests/testthat/test-cleanup-v022.R locks in:
  - ensemble runs on default args;
  - compare_models() runs on default args (single site, multiple sites, return_all = TRUE);
  - c4_fraction round-trip from invert_d2H() to invert_d2h() produces consistent reconstructions;
  - out-of-range c4_fraction is rejected at both wrapper entry points;
  - get_data_manifest() returns NULL on download failure rather than a silently-empty list.
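The c4_fraction contract that these tests lock in (NULL preserved, lengths matched, values in [0, 1], percent used internally) could be checked along the following lines. The function check_c4_fraction is hypothetical and only mirrors the behaviour described in these notes. Treating a length-1 input against a longer d2H_wax as a mismatch is an assumption.

```r
# Hypothetical validator mirroring the described contract; not package code.
check_c4_fraction <- function(c4_fraction, d2H_wax) {
  # Keep NULL as NULL instead of turning it into numeric(0).
  if (is.null(c4_fraction)) return(NULL)
  # Assumption: any length mismatch is rejected, including scalar input.
  if (length(c4_fraction) != length(d2H_wax)) {
    stop("c4_fraction and d2H_wax must have the same length")
  }
  if (any(c4_fraction < 0 | c4_fraction > 1, na.rm = TRUE)) {
    stop(sprintf(
      paste("c4_fraction must lie in [0, 1]; maximum supplied value is %s.",
            "Divide by 100 if the input is on the percent scale."),
      format(max(c4_fraction, na.rm = TRUE))
    ))
  }
  c4_fraction * 100  # percent scale, as used internally
}

ok <- check_c4_fraction(c(0.2, 0.5), c(-150, -160))
none <- check_c4_fraction(NULL, c(-150, -160))
bad <- try(check_c4_fraction(c(20, 50), c(-150, -160)), silent = TRUE)
print(ok)
```

Passing percent-scale values such as c(20, 50) stops with an error that names the maximum value, so a unit mistake surfaces at the entry point rather than deep inside the inversion.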
leafwax 0.2.1
CRAN preparation
- Three-tier posterior resolver wired up: load_posteriors() and available_models() look in heavy posteriors → user cache → preview fixture, in that order. Heavy posteriors are now excluded from the built tarball (~11 MB → ~1.6 MB).
- inst/extdata/posteriors_light/ is regenerated as a true 100-draw stratified subsample of the heavy posteriors with the full column set (the prior version dropped per-knot z columns and silently broke spatial inversion). The script that produces it is at data-raw/regenerate_posteriors_light.R.
- The preview tier is treated as a fixture: load_posteriors(), invert_d2H(), assess_claim(), and detect_change() warn loudly whenever it is in use, naming the function context and the actual draw count after thinning. Set options(leafwax.suppress_preview_warning = TRUE) to silence the warning in batch jobs that have already acknowledged the limitation.
- download_model_data() now writes <model>_posterior.rds (singular) to match what load_posteriors() reads, and points at the bradleylab/leafwax-data archive (Zenodo DOI 10.5281/zenodo.20085465; v1.0.1 version DOI 10.5281/zenodo.20086180). Per-tag raw GitHub URLs are used for direct downloads; Zenodo holds the durable archive.
- jsonlite moved from Suggests to Imports (used unconditionally).
- DESCRIPTION Title reworded to drop the d2H abbreviation.
- Internal helpers (generate_fibonacci_sphere(), predict_spatial_dual_gp(), predict_spatial_mpp(), the four math primitives in spatial_interpolation.R, the legacy v0.1 load_model_posteriors()) are now flagged as internal in the help index.
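The heavy → cache → preview lookup order can be sketched as a first-match search over directories. Everything here is illustrative: resolve_posterior and the temporary directory layout are hypothetical, and only the tier order and the <model>_posterior.rds file name come from these notes.

```r
# Hypothetical resolver: return the first tier that holds the model's file.
resolve_posterior <- function(model_name, tiers) {
  file_name <- paste0(model_name, "_posterior.rds")
  for (tier in names(tiers)) {
    candidate <- file.path(tiers[[tier]], file_name)
    if (file.exists(candidate)) return(list(tier = tier, path = candidate))
  }
  stop("No posterior found for model '", model_name, "'")
}

# Throwaway directories standing in for the three tiers, in priority order.
root <- file.path(tempdir(), "leafwax_resolver_demo")
unlink(root, recursive = TRUE)
tiers <- c(
  heavy = file.path(root, "heavy"),
  cache = file.path(root, "cache"),
  preview = file.path(root, "preview")
)
for (d in tiers) dir.create(d, recursive = TRUE, showWarnings = FALSE)

# Only the preview fixture exists at first.
saveRDS(list(), file.path(tiers[["preview"]], "baseline_posterior.rds"))
first <- resolve_posterior("baseline", tiers)

# After a download lands in the user cache, the cache takes precedence.
saveRDS(list(), file.path(tiers[["cache"]], "baseline_posterior.rds"))
second <- resolve_posterior("baseline", tiers)

print(c(first$tier, second$tier))
```

Returning the tier name alongside the path is one way a caller could decide whether to emit the preview-tier warning described above.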
leafwax 0.2.0
Major release: paleo-record workflow + v10 calibration
leafwax 0.2.0 makes the package the operational backend for the manuscript “Spatial modeling improves the calibration of leaf wax hydrogen isotopes to precipitation” (Bradley, Geochimica et Cosmochimica Acta). The package now ships the v10 posterior draws for the 14 hierarchical Bayesian models reported in the manuscript and exposes the four-phase paleo workflow that the manuscript references in Sections 4.5.3, 4.5.5, and 4.5.6.
New functions
- estimate_sigma_within(d2h_wax, age, baseline_interval, detrend, ar1_correction) – estimate the within-record residual SD on a stationarity-defended baseline interval. Manuscript Section 4.5.3.
- local_effective_slope(longitude, latitude, model_name, override, ceiling = 0.88, n_draws) – per-draw local slope at a site, with the simple-model ceiling from manuscript Section 4.5.5.
- estimate_temporal_autocorrelation(d2h_wax, age, method) – lag-1 autocorrelation for the within-record detection threshold.
- detect_change(reconstruction, age, baseline_interval, test_intervals, sigma_within, sigma_analytical, rho_t, beta_eff, confidence, magnitudes) – within-record d2H_precip detection threshold and posterior probability of change.
- assess_claim(record, claim, reconstruction, ...) – walks the four-level taxonomy from manuscript Section 4.5.6 and returns the highest level a claim survives at, with itemized pass/fail reasons.
invert_d2H()
- New args sigma_within, sigma_within_sd, record_id, slope.
- sigma_within enters in leaf-wax per-mil units and propagates through beta_oipc_eff like the measurement uncertainty (combined in quadrature in standardized wax space before inversion).
- record_id validates that all input rows are from one site and flags coordinate inconsistency under a constant identifier.
- slope accepts NULL (model slope), a single point estimate, or a per-draw vector; rejects zero / negative / non-finite values.
- The exported wrapper now forwards return_full, credible_level, and verbose.
Routing layer
- available_models() exposes the 14 v10 model names from the manuscript: baseline, baseline_sp, baseline_env, baseline_env_sp, baseline_veg, baseline_veg_sp, full, full_sp, full_interact, full_interact_sp, elevation_only_sp, elevation_c4_sp, elevation_c4_interact_sp, c4_only_sp.
- load_posteriors() derives capability flags (has_c4, has_pft, has_elevation, has_gp, has_interaction) from posterior column names rather than name regexes; full, full_sp, full_interact, and full_interact_sp correctly report their vegetation and interaction effects.
- load_posteriors() subsamples deterministically (stratified thinning), so two calls with the same model_name and n_draws return the same draws subset. This is what lets local_effective_slope(..., n_draws = N) pair by position with invert_d2H(..., n_posterior_draws = N, slope = ...).
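Deterministic thinning of the kind described for load_posteriors() can be sketched by picking evenly spaced row indices instead of sampling at random. The function thin_indices is hypothetical, and the package's own stratified scheme may choose indices differently.

```r
# Hypothetical deterministic thinning: evenly spaced indices, no RNG involved.
thin_indices <- function(n_available, n_draws) {
  if (n_draws >= n_available) return(seq_len(n_available))
  as.integer(round(seq(1, n_available, length.out = n_draws)))
}

# Two independent calls select the same rows, so per-draw quantities computed
# separately (e.g. a slope and an inversion) line up by position.
idx_slope <- thin_indices(1000, 100)
idx_inversion <- thin_indices(1000, 100)

print(identical(idx_slope, idx_inversion))
print(head(idx_slope))
```

Because no random number generator is involved, the pairing does not depend on the caller's seed or on the order in which the two functions are called.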
Data
- All 14 v10 model posterior draws shipped as inst/extdata/posteriors/<model>_posterior.rds.
- 125-knot Fibonacci-sphere knot files for the 9 spatial models shipped as inst/extdata/spatial_metadata/<model>_knots.rds.
- Standardisation parameters shipped as inst/extdata/scaling_params.rds.
- Lake Malawi LS11KOMA Common-Era leaf-wax record bundled at inst/extdata/example_records/LS11KOMA_d2H.csv for the vignette.
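For readers unfamiliar with Fibonacci-sphere knots, the textbook construction places points at equal steps in z and rotates each by the golden angle. The sketch below is that generic construction, not the package's generate_fibonacci_sphere(); the offset, ordering, and longitude convention are assumptions, so its output should not be expected to match the shipped knot files.

```r
# Generic Fibonacci (golden-angle) lattice on the unit sphere, returned as
# longitude/latitude in degrees. Conventions here are assumptions.
fibonacci_sphere <- function(n) {
  i <- seq_len(n) - 0.5                 # half-step offset avoids the poles
  golden_angle <- pi * (3 - sqrt(5))    # about 2.4 radians
  z <- 1 - 2 * i / n                    # equal-area spacing in z
  latitude <- asin(z) * 180 / pi
  longitude <- ((i * golden_angle * 180 / pi) %% 360) - 180
  data.frame(longitude = longitude, latitude = latitude)
}

knots <- fibonacci_sphere(125)
print(head(knots))
```

Equal steps in z give each point roughly the same share of the sphere's surface, which is why this lattice is a common choice for near-uniform global knots.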
Vignettes
- New paleo-record-workflow.Rmd: end-to-end seven-step workflow on the Lake Malawi record (load -> sigma_within -> slope -> inversion -> detect_change -> assess_claim -> plot).
- The three v0.1.0 vignettes (Getting Started, Advanced Usage, Model Descriptions) are archived under vignettes/_archive_v0.1.0/ and excluded from the build pending a v0.3 rewrite.
Tests
- New testthat suites for the v10 routing (test-v10-posteriors.R) and each phase of the paleo workflow (test-phase-{a,b,c,d}.R).
- 275 PASS / 0 FAIL / 0 WARN.
- The four v0.1.0 testthat files referencing legacy model names and invert_d2h() signatures are archived under tests/_archive_v0.1.0/ for reference.