We analyze what can be learned from tests for p-hacking based on distributions of t-statistics and p-values across multiple studies. We analytically characterize the restrictions on these distributions that are consistent with the absence of p-hacking. These restrictions form a testable null hypothesis and suggest statistical tests for p-hacking. We extend our results to settings where p-hacking occurs alongside publication bias, and we consider what types of distributions arise under the alternative hypothesis that researchers engage in p-hacking. We show that the power of statistical tests for detecting p-hacking is low even if p-hacking is quite prevalent.