The 5 Most Common Statistical Concepts in Data Science Interviews

DataIst · Posted on 2022-11-10 22:41:55

Last edited by DataIst on 2022-11-14 09:11

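Which statistical concepts are almost guaranteed to come up in a DS interview? Power, Type I error, Type II error, confidence interval, and p-value will all be touched on to some degree. Here I'd like to share how to explain these concepts well to a technical audience and, especially, to a non-technical audience.
An answer for a technical audience usually covers: when the concept is used, a precise definition, what a change in its value means, and some applications (optional). An answer for a non-technical audience is best built around an example (which you can prepare in advance) and should not introduce additional technical terms.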

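Power (for technical):
Used in binary hypothesis tests
The probability of rejecting the null hypothesis when the alternative hypothesis is true (the likelihood that a test will detect an effect when the effect is present); equal to 1 minus the Type II error rate
The higher the statistical power, the better
Used in experiment design to compute the minimum sample size needed to detect an effect of a given size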
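
To make the sample-size point concrete, here is a rough sketch of the usual normal-approximation formula for comparing two group means. The function name, the known shared standard deviation, and the example inputs are all my own assumptions for illustration, not something from a specific library.

```python
import math
from statistics import NormalDist


def sample_size_per_group(effect, sd, alpha=0.05, power=0.8):
    """Minimum n per group for a two-sided, two-sample z-test on means.

    Assumes both groups share a known standard deviation `sd` and that
    `effect` is the smallest difference in means worth detecting.
    """
    z = NormalDist()                    # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the Type I error rate
    z_power = z.inv_cdf(power)          # quantile matching the target power
    n = 2 * ((z_alpha + z_power) * sd / effect) ** 2
    return math.ceil(n)


# Hypothetical inputs: detect a 0.5-unit difference when sd is 1.0.
n_80 = sample_size_per_group(effect=0.5, sd=1.0)
n_90 = sample_size_per_group(effect=0.5, sd=1.0, power=0.9)
print(n_80, n_90)
```

Asking for higher power, or for a smaller detectable effect, pushes the required sample size up.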

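Type I error (for technical):
Also called a false positive; one of the two kinds of error in a binary hypothesis test
Wrongly rejecting a null hypothesis that is actually true, i.e. concluding that the result is significant when it actually occurred by chance
The smaller the Type I error rate, the better (a larger rate means the test is less reliable)
Comes up often in A/B testing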

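Type II error (for technical):
Also called a false negative; one of the two kinds of error in a binary hypothesis test
Failing to reject a null hypothesis that is actually false, i.e. concluding that the result is not significant when a real effect exists
The smaller the Type II error rate, the better (a larger rate means the test is less reliable)
Comes up often in A/B testing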
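
If you want to see the two error rates rather than just define them, a small simulation helps. This is only a sketch under my own assumptions: normally distributed data with a known sd of 1, a two-sided z-test, and made-up values for the group size (63) and true difference (0.5).

```python
import math
import random
from statistics import NormalDist, fmean


def rejection_rate(true_diff, n, sims=2000, alpha=0.05, seed=0):
    """Fraction of simulated A/B tests that reject the null of no difference.

    Each simulated test draws n observations per group from normal
    distributions with sd 1 (assumed known) and runs a two-sided z-test.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    se = math.sqrt(2 / n)  # standard error of the difference in means
    rejections = 0
    for _ in range(sims):
        control = [rng.gauss(0.0, 1.0) for _ in range(n)]
        treatment = [rng.gauss(true_diff, 1.0) for _ in range(n)]
        z = (fmean(treatment) - fmean(control)) / se
        if abs(z) > z_crit:
            rejections += 1
    return rejections / sims


# Null is true (no difference): every rejection is a Type I error.
type_1_rate = rejection_rate(true_diff=0.0, n=63)
# Null is false (difference of 0.5): every non-rejection is a Type II error.
power = rejection_rate(true_diff=0.5, n=63, seed=1)
type_2_rate = 1 - power
print(type_1_rate, type_2_rate, power)
```

The Type I error rate should land near the chosen alpha, while the Type II error rate depends on the effect size and the sample size.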

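Power, Type I error, Type II error (for non-technical, explained with an example):
Someone wants to test whether he has been infected with coronavirus.
In the first case, he is infected and the test detects it. Power is the chance that the test tells us a person is infected if he truly is.
In the second case, he is not infected, but the test comes back positive. This is a Type I error, and it can be serious because he may receive unnecessary treatment.
In the third case, he is infected, but the test comes back negative. This is a Type II error, and the consequence is that he misses the best window for treatment.
So we can see that the higher the power, the more effective the test, and the lower the chance of a Type I or Type II error, the better the test.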

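Confidence Interval:
For technical
A CI uses samples to estimate the true value, which we can never know exactly. A CI is a range that tells us how variable a sample result might be.
The confidence level (usually 95%) is the proportion of CIs that would contain the true value if the sampling were repeated many times.
The higher the confidence level, the wider the CI. The smaller the sample size, the wider the CI.
Never interpret a computed CI as the probability that the true value lies within a certain range, because the true value is not a random variable; the CI is.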
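For non-technical
Suppose we want to know the average height of men in the US. We can randomly select 30 men and measure their average height. From that we can get a 95% CI, for example 168 to 185 cm.
This interval will very likely cover the true average height of US men. How likely? If we repeated the sampling process over and over, 95% of the CIs we compute would contain the true value.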
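
The "repeat the sampling over and over" reading can be checked with a quick simulation. This is a sketch, not a recipe: the true mean of 175 cm and sd of 7 cm are invented numbers, and I assume the sd is known so a simple z-interval can be used (with an unknown sd you would use a t-interval instead).

```python
import math
import random
from statistics import NormalDist, fmean


def ci_coverage(true_mean=175.0, sd=7.0, n=30, level=0.95, sims=2000, seed=0):
    """Share of simulated confidence intervals that contain true_mean."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z_crit * sd / math.sqrt(n)  # fixed width because sd is assumed known
    covered = 0
    for _ in range(sims):
        # Draw a fresh sample of n heights and build an interval around its mean.
        sample_mean = fmean(rng.gauss(true_mean, sd) for _ in range(n))
        if sample_mean - half_width <= true_mean <= sample_mean + half_width:
            covered += 1
    return covered / sims


coverage = ci_coverage()
print(coverage)
```

The true mean never moves in this simulation. What changes from run to run is the interval, which is why the 95% describes the procedure and not any single interval.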

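P value:
For technical
Used in hypothesis testing
A conditional probability: it measures the probability of getting test results at least as extreme as the observed results, given that the null hypothesis is true.
By convention, with a significance level of 0.05: p-value <= 0.05, reject the null hypothesis; p-value > 0.05, do not reject the null hypothesis.
Common in A/B testing, which usually tests whether a metric differs between the treatment group and the control group. The smaller the p-value, the stronger the evidence that the two groups differ.
Never interpret the p-value as the probability, given the observation, that there is at least such a difference between the two groups.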
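For non-technical
Using the same average-height example: we randomly select 30 men and measure their average height, and now we want to know whether the true value equals a fixed value, say 175 cm.
The p-value is how likely we would be to see a sample average at least as far from 175 cm as ours, if the true value really were 175 cm. A very small p-value means such a result would be unlikely, so the data cast doubt on the true value being 175 cm.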
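
For the height example, the calculation might look roughly like this. The sample mean of 178 cm and the sd of 7 cm are hypothetical, and I assume the sd is known so that a z-test applies (with an estimated sd and only 30 people, a t-test would be the more standard choice).

```python
import math
from statistics import NormalDist


def two_sided_p_value(sample_mean, null_mean, sd, n):
    """P(sample mean at least this far from null_mean | null is true)."""
    standard_error = sd / math.sqrt(n)
    z = (sample_mean - null_mean) / standard_error
    # Two-sided: count results this extreme in either direction.
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical data: 30 men with an average height of 178 cm, null value 175 cm.
p = two_sided_p_value(sample_mean=178.0, null_mean=175.0, sd=7.0, n=30)
print(p, "reject" if p <= 0.05 else "do not reject")
```

Note that the function conditions on the null value being true, which is exactly why the result cannot be read as the probability that the null is true.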
