Package 'RoundingMatters' reference manual

Title:	Tools for adjusting for rounding problems in metastudies about p-hacking and publication bias
Description:	Tools for adjusting for rounding problems in metastudies about p-hacking and publication bias
Authors:	Sebastian Kranz, Peter Puetz
Maintainer:	Sebastian Kranz <[email protected]>
License:	GPL (>= 2.0)
Version:	0.1.0
Built:	2025-01-28 02:37:40 UTC
Source:	https://github.com/skranz/RoundingMatters

Density estimates for absolute z-statistics assuming that z-statistics are symmetrically distributed around 0

Description

Avoids downward bias at the left hand side where abs(z)=0.

Usage

absz.density(
  z,
  at = NULL,
  bw = 0.1,
  adjust = 1,
  kernel = "epanechnikov",
  n = 1024,
  weights = NULL,
  ...
)

Arguments

`z`	vector of z-statistics (or absolute z-statistics)
`at`	vector of points where the density shall be evaluated. If NULL, return a function (created by calling `approxfun`) that evaluates the density at arbitrary points.
`bw`, `adjust`, `kernel`, `n`, `weights`, `...`	arguments passed to `stats::density`
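
The boundary correction can be illustrated with plain `stats::density`. The following is a minimal sketch, assuming a reflection approach (mirror the data around 0, then double the estimate on the non-negative side). Whether `absz.density` is implemented exactly this way is not stated in this manual, and the name `absz_density_sketch` is hypothetical.

```r
# Sketch of a reflection-based density estimate for absolute z-statistics.
# Assumption: z is symmetric around 0, so |z| has density 2 * f(z) for z >= 0.
absz_density_sketch <- function(z, bw = 0.1, kernel = "epanechnikov", n = 1024) {
  z <- abs(z)
  # Mirror the data so the kernel estimate sees mass on both sides of 0
  d <- stats::density(c(z, -z), bw = bw, kernel = kernel, n = n,
                      from = 0, to = max(z))
  # Double the density to fold the mirrored half back onto z >= 0
  stats::approxfun(d$x, 2 * d$y, rule = 2)
}

set.seed(1)
z <- rnorm(5000)
f_reflect <- absz_density_sketch(z)

# Naive estimate on |z| without reflection, for comparison
d_naive <- stats::density(abs(z), bw = 0.1, kernel = "epanechnikov",
                          n = 1024, from = 0, to = max(abs(z)))
f_naive <- stats::approxfun(d_naive$x, d_naive$y, rule = 2)

# The true density of |z| at 0 is 2 * dnorm(0). The naive estimate is
# roughly half of the reflected one because kernel mass leaks below 0.
c(reflected = f_reflect(0), naive = f_naive(0), truth = 2 * dnorm(0))
```

The comparison at 0 shows the downward bias that the description refers to.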

Perform kernel estimates of two densities of absolute z-statistics and their ratio.

Description

By default, adds bootstrap standard errors and confidence intervals.

Usage

absz.density.ratio(
  z.num,
  z.denom,
  at,
  bootstrap = TRUE,
  B = 1000,
  ci.level = 0.95,
  bw = 0.1,
  kernel = "epanechnikov",
  return.as = c("long", "wide")[1],
  weights.num = NULL,
  weights.denom = NULL,
  ...
)

Arguments

`z.num`	observed z-statistics forming numerator density
`z.denom`	observed z-statistics forming denominator density
`at`	position where density shall be evaluated
`bootstrap`	if TRUE add bootstrap SE and CI for all measures
`B`	number of bootstrap repetitions
`ci.level`	Confidence level. Default 0.95.
`weights.num`	weights for z.num (optional)
`weights.denom`	weights for z.denom (optional)
`...`	arguments for absz.density
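
To make the bootstrap idea concrete, here is a rough base-R sketch. It assumes a reflection-based density estimate and percentile confidence intervals; the manual does not say which interval type `absz.density.ratio` uses, and the helper names and output columns below are hypothetical.

```r
# Reflection-based density of |z| evaluated at the points `at`
# (assumes z is symmetric around 0; helper name is hypothetical)
absz_dens_at <- function(z, at, bw = 0.1) {
  z <- abs(z)
  d <- stats::density(c(z, -z), bw = bw, kernel = "epanechnikov",
                      from = 0, to = max(z, at))
  2 * stats::approx(d$x, d$y, xout = at, rule = 2)$y
}

density_ratio_boot <- function(z.num, z.denom, at, B = 200, ci.level = 0.95) {
  est <- absz_dens_at(z.num, at) / absz_dens_at(z.denom, at)
  # Resample both samples with replacement and recompute the ratio
  boot <- replicate(B, {
    zn <- sample(z.num, replace = TRUE)
    zd <- sample(z.denom, replace = TRUE)
    absz_dens_at(zn, at) / absz_dens_at(zd, at)
  })
  boot <- matrix(boot, nrow = length(at))  # rows: points in `at`, cols: draws
  a <- (1 - ci.level) / 2
  data.frame(
    at = at,
    ratio = est,
    se = apply(boot, 1, stats::sd),
    ci.low = apply(boot, 1, stats::quantile, probs = a),
    ci.up = apply(boot, 1, stats::quantile, probs = 1 - a)
  )
}

set.seed(1)
res <- density_ratio_boot(rnorm(2000, sd = 1.5), rnorm(2000), at = c(0.5, 1.96))
res
```

With a wider numerator distribution, the estimated ratio at 1.96 should exceed 1.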

Convert numbers like 0.421 to 42.1%

Description

Convert numbers like 0.421 to 42.1%

Usage

as.perc(x, digits = 1)

Arguments

`x`	a vector of floating point numbers
`digits`	to how many decimal digits shall the percentage be rounded?

Draw derounded z assuming missing digits of mu and sigma are uniformly distributed, but adjust for estimated density of z using rejection sampling

Description

Draw derounded z assuming missing digits of mu and sigma are uniformly distributed, but adjust for estimated density of z using rejection sampling

Usage

deround.z.density.adjust(
  z.pdf,
  mu,
  sigma,
  mu.dec = pmax(num.deci(mu), num.deci(sigma)),
  sigma.dec = mu.dec,
  max.rejection.rounds = 10000,
  verbose = TRUE,
  just.uniform = rep(FALSE, length(mu)),
  z.min = 0,
  z.max = 5
)

Arguments

`z.pdf`	An estimated density of the derounded z-statistics (e.g. using only observations with many significant digits), normalized such that its highest value is 1. It is best to use `make.z.pdf` to create such a normalized pdf from a vector of observed z-statistics.
`mu`	Reported coefficient, possibly rounded
`sigma`	Reported standard error, possibly rounded.
`mu.dec`	Number of decimal places mu is reported to. Usually, we would assume that mu and sigma are rounded to the same number of decimal places. Since trailing zeros may not be detected, we set the default `mu.dec=pmax(num.deci(mu),num.deci(sigma))`.
`sigma.dec`	By default equal to mu.dec.
`max.rejection.rounds`	A limit on how often the rejection sampler redraws, to avoid an infinite loop.
`verbose`	If `TRUE`, print an "r" for each resampling draw to show how the function progresses.
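
The rejection step can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's code: candidates come from uniform derounding, each candidate is accepted with probability `z.pdf(|z|)`, and all function names are hypothetical.

```r
# Uniform derounding: the unrounded value lies within half a unit of the
# last reported decimal place (assumption: conventional rounding)
draw_uniform_z <- function(mu, sigma, dec) {
  half <- 0.5 * 10^(-dec)
  mu.true <- mu + stats::runif(length(mu), -half, half)
  sigma.true <- sigma + stats::runif(length(sigma), -half, half)
  mu.true / sigma.true
}

deround_density_adjust_sketch <- function(z.pdf, mu, sigma, dec,
                                          max.rejection.rounds = 10000) {
  z <- rep(NA_real_, length(mu))
  todo <- seq_along(mu)
  rounds <- 0
  while (length(todo) > 0 && rounds < max.rejection.rounds) {
    cand <- draw_uniform_z(mu[todo], sigma[todo], dec[todo])
    # Accept a candidate with probability z.pdf(|z|), which lies in [0, 1]
    accept <- stats::runif(length(todo)) <= z.pdf(abs(cand))
    z[todo[accept]] <- cand[accept]
    todo <- todo[!accept]
    rounds <- rounds + 1
  }
  z  # entries never accepted within the round limit stay NA
}

# Toy normalized pdf with maximum 1 (purely illustrative)
toy.pdf <- function(z) stats::dnorm(z, mean = 2, sd = 1) / stats::dnorm(0)

set.seed(1)
mu <- c(0.02, 0.15, 0.3)
sigma <- c(0.01, 0.07, 0.2)
z <- deround_density_adjust_sketch(toy.pdf, mu, sigma, dec = c(2, 2, 1))
z
```

Because the pdf is normalized to a maximum of 1, it can be used directly as an acceptance probability.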

Draw derounded z assuming missing digits of mu and sigma are uniformly distributed

Description

Draw derounded z assuming missing digits of mu and sigma are uniformly distributed

Usage

deround.z.uniform(
  mu,
  sigma,
  mu.dec = pmax(num.deci(mu), num.deci(sigma)),
  sigma.dec = mu.dec
)

Arguments

`mu`	Reported coefficient, possibly rounded
`sigma`	Reported standard error, possibly rounded.
`mu.dec`	Number of decimal places mu is reported to. Usually, we would assume that mu and sigma are rounded to the same number of decimal places. Since trailing zeros may not be detected, we set the default `mu.dec=pmax(num.deci(mu),num.deci(sigma))`.
`sigma.dec`	By default equal to mu.dec.
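
As a worked illustration, the sketch below draws derounded z-statistics for a coefficient reported as 0.02 with standard error 0.01. It assumes conventional rounding, so the unrounded value lies within half a unit of the last reported decimal place; `deround_uniform_sketch` is a hypothetical name and not part of the package.

```r
deround_uniform_sketch <- function(mu, sigma, mu.dec, sigma.dec = mu.dec) {
  n <- length(mu)
  # Add uniform noise of at most half a unit of the last reported decimal
  mu.true <- mu + stats::runif(n, -0.5, 0.5) * 10^(-mu.dec)
  sigma.true <- sigma + stats::runif(n, -0.5, 0.5) * 10^(-sigma.dec)
  mu.true / sigma.true
}

set.seed(1)
# 10000 draws for mu = 0.02 and sigma = 0.01, both reported to 2 decimals
z <- deround_uniform_sketch(rep(0.02, 10000), rep(0.01, 10000), mu.dec = 2)

# The reported z is 2, but derounded values can lie anywhere between
# 0.015 / 0.015 = 1 and 0.025 / 0.005 = 5
range(z)
mean(z >= 1.96)  # share of draws above the usual 5% threshold
```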

Create an ab.df for the dsr approach

Description

The resulting data frame is required for the derounding by simulating rounding (dsr) approach. It contains a row for all considered combinations of z and s and window half-width h in h.seq. The columns share.below and share.above indicate which share of derounded z-statistics are inside the window and either fall below or above the threshold z0, respectively. Note that 1-share.above-share.below is the share of derounded z-statistics that fall outside the considered window.

Usage

dsr.ab.df(
  dat,
  h.seq = c(0.05, 0.075, 0.1, 0.2, 0.3, 0.4, 0.5),
  z0 = 1.96,
  min.n = 10000,
  min.rounds = 5,
  verbose = TRUE
)

Arguments

`dat`	the data frame that should contain at least the columns z and num.deci (number of decimal places of mu and sigma, maximum of both)
`h.seq`	all considered window half-widths
`z0`	the significance threshold. Can be a single number or a vector with one element per row of dat.
`min.n`	how many z values shall be minimally rounded to compute the derounded z-distribution for each observation.
`min.rounds`	how many repetitions of rounding z-values shall there be at least (even if min.n is already reached).
`verbose`	Shall some progress information be shown? (This function can take a while).
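
The meaning of share.below and share.above can be illustrated with a small base-R sketch. The draws used here are only stand-ins from uniform derounding of a single observation; how the dsr approach actually generates its derounded distribution is not reproduced, the window is assumed to be closed at both ends, and `window_shares_sketch` is a hypothetical name.

```r
window_shares_sketch <- function(z.draws, z0 = 1.96,
                                 h.seq = c(0.05, 0.075, 0.1, 0.2, 0.3, 0.4, 0.5)) {
  do.call(rbind, lapply(h.seq, function(h) {
    in.window <- abs(z.draws - z0) <= h
    data.frame(
      h = h,
      # Shares are relative to all draws, not only those inside the window
      share.below = mean(in.window & z.draws < z0),
      share.above = mean(in.window & z.draws >= z0)
    )
  }))
}

set.seed(1)
n <- 10000
# Stand-in draws: mu = 0.02 and sigma = 0.01, each perturbed by +/- 0.005
z.draws <- (0.02 + runif(n, -0.005, 0.005)) / (0.01 + runif(n, -0.005, 0.005))

ab <- window_shares_sketch(z.draws)
# Remaining share: derounded z-statistics outside the window
ab$share.outside <- 1 - ab$share.below - ab$share.above
ab
```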

Finds observations in dat for which we shall perform dsr adjustment

Description

Adds to dat the logical columns dsr.adjust and dsr.compute. dsr.adjust==TRUE means that the z-statistic of this observation will be adjusted by dsr. The adjustment statistics only depend on the reported z value and the significand s of sigma. We thus don't need to compute the distribution for all rows with dsr.adjust==TRUE. If dsr.compute==TRUE, we shall compute the derounded distribution for this observation.

Usage

dsr.mark.obs(
  dat,
  h.seq = c(0.05, 0.075, 0.1, 0.2, 0.3, 0.4, 0.5),
  z0 = 1.96,
  s.max = 100,
  no.deround = NULL
)

Arguments

`dat`	the data frame, must have columns `z` and `s`
`h.seq`	vector of considered window half-widths. We mark an observation for adjustment if it is at risk of misclassification, wrong inclusion, or wrong exclusion for any considered window size.
`z0`	the significance threshold for z (default=1.96).
`s.max`	only mark observations for adjustment that have `s <= s.max`. Default value is 100.
`no.deround`	a logical vector indicating rows that shall never be derounded

Compute a normalized pdf from a vector of z-statistics

Description

The PDF is normalized such that the point of highest density is 1

Usage

make.z.pdf(
  z,
  bw = 0.05,
  kernel = "gaussian",
  n = 512,
  dat,
  min.s = 100,
  z.min = 0,
  z.max = 5,
  show.hist = FALSE,
  ...
)

Arguments

`z`	a vector of z statistics. Usually, you would select all values from dat whose mu and sigma have sufficiently many significant digits
`...`	other parameters passed to `stats::density`
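
The normalization can be sketched with `stats::density` alone. This is a minimal illustration that assumes the pdf is estimated on absolute z-statistics between `z.min` and `z.max` and set to 0 outside that range; `make_z_pdf_sketch` is a hypothetical name.

```r
make_z_pdf_sketch <- function(z, bw = 0.05, kernel = "gaussian", n = 512,
                              z.min = 0, z.max = 5) {
  d <- stats::density(abs(z), bw = bw, kernel = kernel, n = n,
                      from = z.min, to = z.max)
  # Rescale so that the highest density value equals 1
  stats::approxfun(d$x, d$y / max(d$y), yleft = 0, yright = 0)
}

set.seed(1)
z.pdf <- make_z_pdf_sketch(rnorm(5000, mean = 2))

grid <- seq(0, 5, length.out = 512)
max(z.pdf(grid))  # highest value on the grid
z.pdf(6)          # outside [z.min, z.max]
```

A pdf scaled this way takes values in [0, 1], which is what a rejection sampler needs as an acceptance probability.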

Compute minimum and maximum possible values of z given rounded mu and sigma

Description

Compute minimum and maximum possible values of z given rounded mu and sigma

Usage

min.max.z(
  mu,
  sigma,
  mu.dec = pmax(num.deci(mu), num.deci(sigma)),
  sigma.dec = mu.dec
)

Arguments

`mu`	Vector of reported estimated coefficients
`sigma`	Vector of reported standard errors
`mu.dec`	Number of reported decimal digits for mu. By default the maximum of the number of decimal places of mu and sigma, i.e. `pmax(num.deci(mu), num.deci(sigma))`.
`long`	if TRUE (default), return results in a long format
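
The bounds follow from simple interval arithmetic. The sketch below assumes conventional rounding and works on absolute values; the package may treat signs or edge cases differently, and `min_max_z_sketch` and its column names are hypothetical.

```r
min_max_z_sketch <- function(mu, sigma, mu.dec, sigma.dec = mu.dec) {
  mu.half <- 0.5 * 10^(-mu.dec)
  sigma.half <- 0.5 * 10^(-sigma.dec)
  abs.mu <- abs(mu)
  data.frame(
    z = abs.mu / sigma,
    # Smallest |z|: smallest possible |mu| over largest possible sigma
    z.min = pmax(abs.mu - mu.half, 0) / (sigma + sigma.half),
    # Largest |z|: largest possible |mu| over smallest possible sigma,
    # infinite if the true sigma could be arbitrarily close to 0
    z.max = ifelse(sigma > sigma.half,
                   (abs.mu + mu.half) / (sigma - sigma.half), Inf)
  )
}

res <- min_max_z_sketch(mu = c(0.02, 0.196), sigma = c(0.01, 0.100),
                        mu.dec = c(2, 3))
res
```

The first row shows how coarse rounding leaves z anywhere between 1 and 5, while the second row, reported to three decimals, is pinned down much more tightly.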

Get the number of decimal places of a floating point number using R's character representation of those numbers

Description

Get the number of decimal places of a floating point number using R's character representation of those numbers

Usage

num.deci(x)

Arguments

`x`	a numeric vector

Get the number of significant digits of a floating point number using R's character representation of those numbers

Description

We assume that trailing zeros left of the decimal point are significant digits while trailing zeros right of the decimal point are not significant digits

Usage

num.sig.digits(x)

Arguments

`x`	a numeric vector

Get the last significant digit(s) of a floating point number

Description

Get the last significant digit(s) of a floating point number

Usage

rightmost.sig.digit(x, r1 = 1, r2 = 1)

Arguments

`x`	The vector of floating point numbers
`r1`	Starting position from right
`r2`	Ending position from right

Compute thresholds for the significand s of the reported standard deviation such that we can rule out the errors: misclassification, wrong inclusion, wrong exclusion

Description

Compute thresholds for the significand s of the reported standard deviation such that we can rule out the errors: misclassification, wrong inclusion, wrong exclusion

Usage

rounding.risk.s.thresholds(z, z0 = z0, h = 0.2)

Arguments

`z`	a vector of z statistics
`z0`	significance threshold. Can be a single number or a vector of the same length as z
`h`	half-width of considered window around z0

Value

A data frame with the columns "z", "s.misclass", "s.include", "s.exclude" specifying for each z value the corresponding thresholds.

Assess for observations with reported z-statistic z and a significand s of the standard error whether they are at risk of the errors: misclassification, wrong inclusion, wrong exclusion

Description

Assess for observations with reported z-statistic z and a significand s of the standard error whether they are at risk of the errors: misclassification, wrong inclusion, wrong exclusion

Usage

rounding.risks(z, s, z0 = 1.96, h = 0.2)

Arguments

`z`	a vector of z statistics
`s`	vector of corresponding significands of the standard error
`z0`	significance threshold. Can be a single number like 1.96 or a vector of the same length as z
`h`	half-width of considered window around z0

Value

A data frame with rounding risk information for each observation. We illustrate the columns for the misclassification risk:

`s.misclass`	the threshold for the significand s at or above which we can rule out misclassification risk
`risk.misclass = s < s.misclass`	indicates whether the observation is at risk of misclassification
`risk.misclass.below = risk.misclass & z < z0`	indicates whether the observation is at risk of misclassification and below the significance threshold

The other columns should be self-explanatory given this information.
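
The column definitions above translate directly into vectorized R. In this small sketch the threshold values in `s.misclass` are made up for illustration; in practice they would come from `rounding.risk.s.thresholds`, whose formula is not reproduced here.

```r
# Hypothetical inputs: reported z, significand s of the standard error,
# and illustrative (made-up) misclassification thresholds
dat <- data.frame(
  z = c(1.90, 2.05, 3.50),
  s = c(12, 250, 7),
  s.misclass = c(20, 15, 1)
)
z0 <- 1.96

# At risk if the significand is smaller than the threshold
dat$risk.misclass <- dat$s < dat$s.misclass
# At risk and currently reported below the significance threshold
dat$risk.misclass.below <- dat$risk.misclass & dat$z < z0
dat
```

Only the first observation, with few significant digits and z close to z0, is flagged.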

Summary statistics for rounding risks for different thresholds

Description

Summary statistics for rounding risks for different thresholds

Usage

rounding.risks.summary(rr.dat, s.thresh = 0:100, long = TRUE)

Arguments

`rr.dat`	A data frame returned from a call to `rounding.risks`.
`long`	if TRUE (default), return results in a long format
`s.thresh`	a vector of considered s thresholds

Sample derounded z from the uniformly derounded distribution for a given single value of mu and sigma

Description

Sample derounded z from the uniformly derounded distribution for a given single value of mu and sigma

Usage

sample.uniform.z.deround(
  n,
  mu,
  sigma,
  mu.dec = pmax(num.deci(mu), num.deci(sigma)),
  sigma.dec = mu.dec
)

Arguments

`n`	Number of sample draws
`mu`	Reported coefficient, possibly rounded
`sigma`	Reported standard error, possibly rounded.
`mu.dec`	Number of decimal places mu is reported to. Usually, we would assume that mu and sigma are rounded to the same number of decimal places. Since trailing zeros may not be detected, we set the default `mu.dec=pmax(num.deci(mu),num.deci(sigma))`.
`sigma.dec`	By default equal to mu.dec.

Sets the last digit of a number x to zero

Description

Sets the last digit of a number x to zero

Usage

set.last.digit.zero(x)

Arguments

`x`	a numeric vector

Get the significands of a numeric vector using R's character representation of those numbers

Description

The significand is the integer formed by all significant digits, e.g. the significand of 0.012 is 12.

Usage

significand(x, num.deci = NULL)

Arguments

`x`	a numeric vector.
`num.deci`	If not NULL, a vector that states the number of reported decimal places for x. This can be used if we know that there were additional trailing zeros.
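
The idea of reading digits off the character representation can be sketched in base R. This is an illustration only: the variable names are hypothetical, the string handling is one possible approach rather than the package's code, and the treatment of a known number of decimals in the last line is an assumption.

```r
x <- c(0.012, 1.5, 120, 0.30)

# Character representation of each number without scientific notation
s <- vapply(x, function(v) format(abs(v), scientific = FALSE), character(1))

# Decimal places: number of characters after the decimal point
num.deci.sketch <- nchar(sub("^[^.]*\\.?", "", s))

# Significand: drop the decimal point and leading zeros
significand.sketch <- as.numeric(sub("^0+", "", gsub(".", "", s, fixed = TRUE)))

data.frame(x, chr = s, num.deci = num.deci.sketch,
           significand = significand.sketch)

# R stores 0.30 as 0.3, so the trailing zero is lost. If we know that two
# decimals were reported, the significand can be rebuilt from that number:
round(abs(0.30) * 10^2)
```

The last row of the data frame shows why a separate `num.deci` argument is useful: the reported value 0.30 is indistinguishable from 0.3 once it is stored as a number.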

ggplot2 density lines for absolute z-statistics assuming that they are symmetrically distributed around 0

Description

Unlike normal [geom_density] or [stat_density], the density estimate does not artificially decrease at the left bound 0. Note that this function only works well if the data start at 0 on the left. Possible atoms at z=0 should ideally be removed.

Usage

stat_abszdensity(
  mapping = NULL,
  data = NULL,
  geom = "line",
  position = "stack",
  ...,
  bw = "nrd0",
  adjust = 1,
  kernel = "epanechnikov",
  n = 512,
  trim = FALSE,
  na.rm = FALSE,
  orientation = NA,
  show.legend = NA,
  inherit.aes = TRUE
)

Arguments

`bw`	The smoothing bandwidth to be used. If numeric, the standard deviation of the smoothing kernel. If character, a rule to choose the bandwidth, as listed in [stats::bw.nrd()].
`adjust`	A multiplicative bandwidth adjustment. This makes it possible to adjust the bandwidth while still using a bandwidth estimator. For example, 'adjust = 1/2' means use half of the default bandwidth.
`kernel`	Kernel. See list of available kernels in [density()].
`n`	number of equally spaced points at which the density is to be estimated, should be a power of two, see [density()] for details
`trim`	If 'FALSE', the default, each density is computed on the full range of the data. If 'TRUE', each density is computed over the range of that group: this typically means the estimated x values will not line-up, and hence you won't be able to stack density values. This parameter only matters if you are displaying multiple densities in one plot or if you are manually adjusting the scale limits.

Computed variables

density: density estimate
count: density * number of points - useful for stacked density plots
scaled: density estimate, scaled to maximum of 1
ndensity: alias for 'scaled', to mirror the syntax of ['stat_bin()']

Analysis with derounded z-statistics for different window half-widths around z0

Description

This is the main function you will call if you want to perform a publication bias / p-hacking analysis with derounded z-statistics. It allows flexible combinations of how a single derounded z vector is drawn, which statistics are computed for each combination of window h and derounded z-draw and how those statistics are aggregated over multiple replications.

Usage

study.with.derounding(
  dat,
  h.seq = c(0.05, 0.075, 0.1, 0.2, 0.3, 0.4, 0.5),
  window.fun = window.t.ci,
  mode = c("reported", "uniform", "zda", "dsr")[1],
  alt.mode = c("uniform", "reported")[1],
  make.z.fun = NULL,
  z0 = ifelse(has.col(dat, "z0"), dat[["z0"]], 1.96),
  repl = 1,
  aggregate.fun = "median",
  ab.df = NULL,
  z.pdf = NULL,
  max.s = 100,
  common.deci = TRUE,
  verbose = TRUE
)

Arguments

`dat`	a data frame containing all observations. Each observation is a test from a regression table in some article. It must have the columns `mu` (reported coefficient) and `sigma` (reported standard error). The optional column `no.deround` can specify rows whose z statistic shall never be derounded. `dat` can also have the columns z, num.deci, mu.deci and sigma.deci. If those columns do not exist, they will be computed from mu and sigma.
`h.seq`	All considered half-window sizes
`window.fun`	The function that computes for each draw of a derounded z vector and a window h the statistics of interest. Examples are `window.t.ci` (DEFAULT) or `window.binom.test`. Note that our implementation of dsr derounding (or any other derounding using `ab.df`) does not draw derounded z, but only creates a logical vector `above` indicating which draws are above or below the z0 threshold. This means if you write a custom function, it should essentially work on that vector.
`mode`	Mode how a single draw of derounded z is computed: "reported", "uniform","zda","dsr" or some custom name (requires ab.df to be defined)
`alt.mode`	Either "uniform" (DEFAULT) or "reported". Some derounding modes like "zda" and "dsr" cannot be well defined (or are too time-consuming to compute) for observations with many significant digits or outlier z-statistics. `alt.mode` specifies how z values shall be selected for those observations.
`z0`	The significance threshold for z
`repl`	Number of replications of each derounding draw.
`aggregate.fun`	How shall multiple replications be aggregated? Not yet implemented. Currently we always take the median of each variable returned by window.fun over all replications.
`ab.df`	Required if `mode=="dsr"` or some custom mode. See e.g. `dsr.ab.df`.
`z.pdf`	Required if `mode=="zda"`. Should be generated via `make.z.pdf`.
`max.s`	Used if `mode=="zda"`. Specifies the maximum significand for which zda derounding shall be performed. For observations with larger significand s, uniform derounding will be performed.
`common.deci`	Shall we assume that mu and sigma are given with the same number of decimal places? If `TRUE` (default), take the column `num.deci` if present in `dat` or create it as the pairwise maximum of the decimal places of mu and sigma. If `FALSE`, either use the columns `mu.deci` and `sigma.deci` if present in `dat` or generate them from mu and sigma.

Apply on windows a one-sided binomial test with H0: z <= z0

Description

Apply on windows a one-sided binomial test with H0: z <= z0

Usage

window.binom.test(above = z >= z0, h = NA, ci.level = 0.95, z, z0, ...)

Apply on windows a two-sided binomial test with H0: z = z0

Description

Apply on windows a two-sided binomial test with H0: z = z0

Usage

window.binom.test.2s(above = z >= z0, h = NA, ci.level = 0.9, z, z0, ...)

Window function returning estimated probability that a z-statistic is above a threshold z0 in a window with half-width h around z0 and t-test confidence intervals

Description

Can be used as argument window.fun in study.with.derounding

Usage

window.t.ci(above = z >= z0, h = NA, ci.level = 0.95, z, z0, ...)
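
A window function of this kind essentially operates on the logical vector `above`. The following base-R sketch shows one way to obtain the estimated share above z0 together with a t-test confidence interval; the function name and the output column names are hypothetical and need not match what `window.t.ci` returns.

```r
window_t_ci_sketch <- function(above, ci.level = 0.95) {
  x <- as.numeric(above)
  # One-sample t-test on the 0/1 indicator yields a CI for the share above z0
  tt <- stats::t.test(x, conf.level = ci.level)
  data.frame(obs = length(x), theta = mean(x),
             ci.low = tt$conf.int[1], ci.up = tt$conf.int[2])
}

set.seed(1)
z0 <- 1.96
h <- 0.2
z <- abs(rnorm(5000, mean = 2))

# Keep only z-statistics inside the window [z0 - h, z0 + h]
in.window <- abs(z - z0) <= h
above <- z[in.window] >= z0

res <- window_t_ci_sketch(above)
res
```

A custom function passed as `window.fun` could follow the same pattern, replacing the t-test by any other statistic computed from `above`.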
