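Author: celestialgod (天)
Board: Statistics
Title: [Discussion] statement of ASA on p-values
Time: Thu Mar 10 23:57:08 2016
The ASA recently released a statement on statistical significance and p-values.
Here are the summary and the full text (the links point to my personal Dropbox
and can be viewed without logging in):
Short summary: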
http://tinyurl.com/zswny43
The ASA's statement on p-values: context, process, and purpose:
http://tinyurl.com/zw5yyum
The most important part is probably the following six principles:
1. P-values can indicate how incompatible the data are with a specified
statistical model.
2. P-values do not measure the probability that the studied hypothesis is
true, or the probability that the data were produced by random chance
alone.
3. Scientific conclusions and business or policy decisions should not be
based only on whether a p-value passes a specific threshold.
4. Proper inference requires full reporting and transparency.
5. A p-value, or statistical significance, does not measure the size of an
effect or the importance of a result.
6. By itself, a p-value does not provide a good measure of evidence regarding
a model or hypothesis.
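A few personal takeaways:
Take the first principle, "P-values can indicate how incompatible the data are
with a specified statistical model." In other words, a p-value measures how far
the data are from the statistical distribution specified by the null
hypothesis. I read this as pointing at things like tests of whether data follow
a particular distribution, which are clearly misused. (This also comes up on
page 9 of the full text, under the second principle: "Researchers often wish to
turn a p-value into a statement about the truth of a null hypothesis, or about
the probability that random chance produced the observed data. The p-value is
neither.")
The most common case is the normality test: someone runs a normality test, gets
a p-value > 0.1, and declares that the data come from a normal distribution.
The null hypothesis there is that the data are normal, but by the first
principle, what you are testing is how incompatible the data are with
normality, not whether they are normal. This has to be stated very carefully.
I came across a paper title that I found quite interesting and want to share:
Absence of evidence is not evidence of absence.
This is one of the key points of the ASA statement: a lack of evidence against
the null hypothesis does not mean the null hypothesis is true. It is the same
with the normality test: when the p-value > 0.1, the conclusion is that you
have no evidence the data are non-normal, which does not mean the data are
normal.
(Absence of evidence: no evidence of non-normality) (evidence of absence:
evidence of normality)
(The explanation of the second principle also mentions this: "It is a statement
about data in relation to a specified hypothetical explanation, and is not a
statement about the explanation itself.")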
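To make the normality-test point concrete, here is a minimal simulation sketch. It is my own illustration, not something from the ASA statement. It assumes the Jarque-Bera test as the normality test (with its asymptotic chi-square(2) p-value, which is only rough at small n) and draws data from a uniform distribution, which is certainly not normal. The function names, sample sizes, and the 0.1 cutoff are all choices made for the illustration.

```python
import math
import random


def jarque_bera_pvalue(xs):
    """Asymptotic p-value of the Jarque-Bera normality test.

    Under the null of normality, JB is approximately chi-square with
    2 degrees of freedom, whose survival function is exp(-x / 2).
    """
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    return math.exp(-jb / 2.0)


def share_not_rejected(n, reps=200, alpha=0.1, seed=0):
    """Fraction of uniform (non-normal) samples of size n with p > alpha."""
    rng = random.Random(seed)
    kept = 0
    for _ in range(reps):
        xs = [rng.random() for _ in range(n)]  # Uniform(0, 1): not normal
        if jarque_bera_pvalue(xs) > alpha:
            kept += 1
    return kept / reps


small = share_not_rejected(30)
large = share_not_rejected(2000)
print(f"n=30:   share of samples with p > 0.1 = {small:.2f}")
print(f"n=2000: share of samples with p > 0.1 = {large:.2f}")
```

In this setup most small samples get p > 0.1 even though the data are never normal, while large samples from the same distribution are almost always rejected. So p > 0.1 here says more about the sample size and the power of the test than about normality.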
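The fifth principle is also important: "A p-value, or statistical significance,
does not measure the size of an effect or the importance of a result." P-values
cannot be used to compare how important results are; a smaller p-value does not
mean a more important result. For what p-values cannot measure here, the ASA
points to other approaches, such as confidence, credibility, or prediction
intervals; Bayesian methods; and alternative measures of evidence, such as
likelihood ratios or Bayes Factors. The full passage: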
In view of the prevalent misuses of and misconceptions concerning p-values,
some statisticians prefer to supplement or even replace p-values with other
approaches. These include methods that emphasize estimation over testing,
such as confidence, credibility, or prediction intervals; Bayesian methods;
alternative measures of evidence, such as likelihood ratios or Bayes Factors;
and other approaches such as decision-theoretic modeling and false discovery
rates. All these measures and approaches rely on further assumptions, but
they may more directly address the size of an effect (and its associated
uncertainty) or whether the hypothesis is correct.
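As a rough illustration of the fifth principle, the sketch below compares two made-up scenarios using a two-sample z-test with a known common standard deviation. The numbers (differences of 0.02 and 0.80, group sizes of 100000 and 8) and the helper name two_sample_z are assumptions for the example, not anything taken from the ASA statement.

```python
import math


def two_sample_z(diff, sd, n):
    """Two-sided z-test for a difference in means.

    Assumes two groups of size n with a known common standard deviation sd.
    Returns (p_value, ci_low, ci_high), where the interval is an
    approximate 95% confidence interval for the difference.
    """
    se = sd * math.sqrt(2.0 / n)             # standard error of the difference
    z = diff / se
    p = math.erfc(abs(z) / math.sqrt(2.0))   # two-sided normal tail probability
    half = 1.96 * se
    return p, diff - half, diff + half


tiny_effect = two_sample_z(diff=0.02, sd=1.0, n=100_000)
large_effect = two_sample_z(diff=0.80, sd=1.0, n=8)

for label, (p, lo, hi) in [
    ("tiny effect, n=100000 per group", tiny_effect),
    ("large effect, n=8 per group", large_effect),
]:
    print(f"{label}: p = {p:.2g}, 95% CI = ({lo:.3f}, {hi:.3f})")
```

The tiny difference comes out with a very small p-value, and the large difference does not reach 0.05. The confidence intervals show what the p-values hide: the first effect is estimated precisely but is negligible in size, while the second may be large but is estimated very imprecisely.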
What does everyone think of this ASA statement?
A blog post I came across on the morning of 3/11 that lays out some of the
value of p-values:
http://tinyurl.com/jebjua6
--
※ Origin: PTT (ptt.cc), From: 180.218.152.118
※ Article URL: https://webptt.com/cn.aspx?n=bbs/Statistics/M.1457625431.A.E49.html
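1F:Push allen1985: I think this is a pretty "fair" article, worth a read 03/11 03:21
2F:→ allen1985: Lots of anti-p-value people in recent years, but some of the criticism goes too far. After all, 03/11 03:22
These past few days I saw an R blogger post titled "ASA says NO to p-values"....
That is really over the top ~"~. I would rather read the ASA as laying out the
value of p-values and correcting misconceptions.
3F:→ allen1985: statistics still needs some way to reach a conclusion 03/11 03:22
4F:→ allen1985: One question I've always wanted to ask: is there a difference between p-value = 0.8 and p-value = 0.6, 03/11 03:23
5F:→ allen1985: and is there a difference between p-value = 0.01 and p-value = 0.00001 03/11 03:24
6F:→ allen1985: or not 03/11 03:24
Comparing p-values is meaningless in itself, so asking whether they differ is
even more beside the point.
7F:→ andrew43: Replying to the above: I don't think this comparison means much on its own; you still need to look at 03/11 07:27
8F:→ andrew43: other measures, e.g. effect size or Bayes factor. 03/11 07:28
9F:→ andrew43: Otherwise you could say they differ, but when drawing conclusions you could just as well say they don't. 03/11 07:31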
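I'm rather curious about the likelihood ratios the article mentions. Is it
because a likelihood ratio is the ratio of the likelihoods under two
hypotheses that it is better suited as a measure of evidence?
Unlike ordinary hypothesis testing, where the null and the alternative are
complements of each other.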
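For readers who want to see what a likelihood ratio looks like in the simplest case, here is a small sketch with two simple hypotheses about a coin: p = 0.6 versus p = 0.5. The data (12 heads in 20 tosses, 60 heads in 100 tosses) and the function names are made up for illustration; this is just one textbook way to compute the ratio, not a method prescribed by the ASA statement.

```python
import math


def binom_log_lik(k, n, p):
    """Log-likelihood of k successes in n Bernoulli(p) trials."""
    return (math.log(math.comb(n, k))
            + k * math.log(p)
            + (n - k) * math.log(1.0 - p))


def likelihood_ratio(k, n, p_a, p_b):
    """How many times more likely the data are under p_a than under p_b."""
    return math.exp(binom_log_lik(k, n, p_a) - binom_log_lik(k, n, p_b))


lr_20 = likelihood_ratio(k=12, n=20, p_a=0.6, p_b=0.5)
lr_100 = likelihood_ratio(k=60, n=100, p_a=0.6, p_b=0.5)

print(f"12/20 heads:  L(p=0.6) / L(p=0.5) = {lr_20:.2f}")
print(f"60/100 heads: L(p=0.6) / L(p=0.5) = {lr_100:.2f}")
```

The ratio compares two specific explanations directly, so it says which of the two the data support better and by how much. The same observed proportion of 0.6 gives a ratio only slightly above 1 with 20 tosses and a noticeably larger one with 100 tosses. The price is that both hypotheses have to be fully specified, which is one of the "further assumptions" the ASA passage mentions.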
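10F:→ allen1985: My main point is that most articles now hold that non-significant is just non- 03/11 08:11
11F:→ allen1985: significant, and two non-significant p-values can't be compared directly, but still 03/11 08:11
12F:→ allen1985: quite a lot of people compare them 03/11 08:11
13F:→ allen1985: I absolutely agree that many things can't be judged by the p-value alone; you need other measures and plots 03/11 08:12
14F:→ allen1985: before you can draw a conclusion 03/11 08:12
That is the third principle: p-values should not be used for all-or-nothing
inference, and in fact a p-value cannot do that job.
An excerpt from the third principle: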
Pragmatic considerations often require binary, “yes-no” decisions, but this
does not mean that p-values alone can ensure that a decision is correct or
incorrect. The widespread use of “statistical significance” (generally
interpreted as “p <= 0.05”) as a license for making a claim of a scientific
finding (or implied truth) leads to considerable distortion of the scientific
process.
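15F:→ KirinGuess: So the OP disagrees with the article's first point? 03/12 19:22
16F:→ KirinGuess: And thinks the first point conflicts with the rest of the article? 03/12 19:22
I think the first principle is put very well XD. The normality test is a
common misuse.
17F:Push milk0925: This post helped me a ton, thanks for sharing! 03/12 22:18
You're welcome
※ Edited: celestialgod (180.218.152.118), 03/12/2016 22:31:09
※ Edited: celestialgod (140.109.74.87), 04/19/2016 15:23:49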