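The statannotations package provides visual annotations of statistical significance levels for pairs of data in plots (for example, in seaborn boxplots or barplots). These annotations can use a "star" text format, in which one or more stars appear on top of a bar drawn between a pair of data groups.
Is there a way to customize the thresholds for the stars? I would like the first significance threshold (one star, *) to be 0.0001 instead of 0.05, with two stars ** at 0.00001 and three stars *** at 0.000001.
The example plot is generated from the example code on the statannotations GitHub page: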
from statannotations.Annotator import Annotator
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
df = sns.load_dataset("tips")
x = "day"
y = "total_bill"
order = ['Sun', 'Thur', 'Fri', 'Sat']
ax = sns.boxplot(data=df, x=x, y=y, order=order)
annot = Annotator(ax, [("Thur", "Fri"), ("Thur", "Sat"), ("Fri", "Sun")], data=df, x=x, y=y, order=order)
annot.configure(test='Mann-Whitney', text_format='star', loc='outside', verbose=2)
annot.apply_test()
ax, test_results = annot.annotate()
plt.savefig('example_non-hue_outside.png', dpi=300, bbox_inches='tight')
Setting verbose to 2 also prints the thresholds used to determine how many stars appear above each bar:
p-value annotation legend:
ns: p <= 1.00e+00
*: 1.00e-02 < p <= 5.00e-02
**: 1.00e-03 < p <= 1.00e-02
***: 1.00e-04 < p <= 1.00e-03
****: p <= 1.00e-04
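I would like to give the Annotator something like a dictionary of {p-value threshold: number of stars}, but I don't know which parameter to supply.
Best answer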
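In their repository, specifically in the file Annotator.py, we have self._pvalue_format = PValueFormat(). This means we can modify that object directly. The PValueFormat class, whose source is in the same repository, has the following configurable parameters: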
CONFIGURABLE_PARAMETERS = [
'correction_format',
'fontsize',
'pvalue_format_string',
'simple_format_string',
'text_format',
'pvalue_thresholds',
'show_test_name'
]
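For completeness, here is a modified version of the code and the new results, with two print calls showing the p-value configuration before and after the change. The figure changes accordingly.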
# ! pip install statannotations
from smartprint import smartprint as sprint
from statannotations.Annotator import Annotator
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
df = sns.load_dataset("tips")
x = "day"
y = "total_bill"
order = ['Sun', 'Thur', 'Fri', 'Sat']
ax = sns.boxplot(data=df, x=x, y=y, order=order)
annot = Annotator(ax, [("Thur", "Fri"), ("Thur", "Sat"), ("Fri", "Sun")], data=df, x=x, y=y, order=order)
print ("Before hardcoding pvalue thresholds ")
sprint (annot.get_configuration()["pvalue_format"])
annot.configure(test='Mann-Whitney', text_format='star', loc='outside', verbose=2)
# Override the default star thresholds by assigning to the private attribute (illustrative values)
annot._pvalue_format.pvalue_thresholds = [[0.01, '****'], [0.03, '***'], [0.2, '**'], [0.6, '*'], [1, 'ns']]
annot.apply_test()
ax, test_results = annot.annotate()
plt.savefig('example_non-hue_outside.png', dpi=300, bbox_inches='tight')
print ("After hardcoding pvalue thresholds ")
sprint (annot.get_configuration()["pvalue_format"])
Output:
Before hardcoding pvalue thresholds
Dict: annot.get_configuration()["pvalue_format"]
Key: Value
{'correction_format': '{star} ({suffix})',
'fontsize': 'medium',
'pvalue_format_string': '{:.3e}',
'pvalue_thresholds': [[0.0001, '****'],
[0.001, '***'],
[0.01, '**'],
[0.05, '*'],
[1, 'ns']],
'show_test_name': True,
'simple_format_string': '{:.2f}',
'text_format': 'star'}
p-value annotation legend:
ns: p <= 1.00e+00
*: 2.00e-01 < p <= 6.00e-01
**: 3.00e-02 < p <= 2.00e-01
***: 1.00e-02 < p <= 3.00e-02
****: p <= 1.00e-02
Thur vs. Fri: Mann-Whitney-Wilcoxon test two-sided, P_val:6.477e-01 U_stat=6.305e+02
Thur vs. Sat: Mann-Whitney-Wilcoxon test two-sided, P_val:4.690e-02 U_stat=2.180e+03
Sun vs. Fri: Mann-Whitney-Wilcoxon test two-sided, P_val:2.680e-02 U_stat=9.605e+02
After hardcoding pvalue thresholds
Dict: annot.get_configuration()["pvalue_format"]
Key: Value
{'correction_format': '{star} ({suffix})',
'fontsize': 'medium',
'pvalue_format_string': '{:.3e}',
'pvalue_thresholds': [[0.01, '****'],
[0.03, '***'],
[0.2, '**'],
[0.6, '*'],
[1, 'ns']],
'show_test_name': True,
'simple_format_string': '{:.2f}',
'text_format': 'star'}
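The legend and the two configuration dumps above suggest how the list of [threshold, label] pairs is read: a p-value gets the label of the smallest threshold it does not exceed. The following is a minimal pure-Python sketch of that lookup, not the library's own implementation; the function name star_label and the empty-string fallback are hypothetical, and the p-values are the ones reported in the verbose output.

```python
def star_label(p_value, thresholds):
    """Return the label of the smallest threshold that p_value does not exceed."""
    for cutoff, label in sorted(thresholds, key=lambda pair: pair[0]):
        if p_value <= cutoff:
            return label
    # Hypothetical fallback for a p-value above every cutoff.
    return ""


# Default thresholds and the hardcoded ones, copied from the dumps above.
default_thresholds = [[0.0001, "****"], [0.001, "***"], [0.01, "**"], [0.05, "*"], [1, "ns"]]
custom_thresholds = [[0.01, "****"], [0.03, "***"], [0.2, "**"], [0.6, "*"], [1, "ns"]]

# p-values reported by the Mann-Whitney tests in the verbose output.
p_values = {"Thur vs. Fri": 0.6477, "Thur vs. Sat": 0.0469, "Sun vs. Fri": 0.0268}

for pair, p in p_values.items():
    print(pair, star_label(p, default_thresholds), star_label(p, custom_thresholds))
```

Under this reading, Thur vs. Sat moves from * to ** and Sun vs. Fri from * to ***, while Thur vs. Fri stays ns, which agrees with the printed legend.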
Image: (the regenerated plot, annotated using the new thresholds)
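Edit:
Based on a comment from user Bonlenfum, the thresholds can also be changed simply by passing pvalue_thresholds as a keyword argument in the call to .configure, as follows: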
annot.configure(test='Mann-Whitney', text_format='star', loc='outside',
                verbose=2,
                pvalue_thresholds=[[0.01, '****'], [0.03, '***'], [0.2, '**'], [0.6, '*'], [1, 'ns']])
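Since the question asks for something like a dictionary of {threshold: number of stars}, one possible way to bridge the two formats is a small helper that converts such a dictionary into the list-of-pairs form shown above. This is only a sketch: thresholds_from_dict is a hypothetical helper, not part of statannotations, and it assumes that the smallest-first ordering and the final [1, 'ns'] entry seen in the default configuration are what the library expects.

```python
def thresholds_from_dict(stars_by_cutoff, ns_label="ns"):
    """Turn {cutoff: number_of_stars} into [[cutoff, label], ...], smallest cutoff first."""
    pairs = [[cutoff, "*" * n_stars] for cutoff, n_stars in sorted(stars_by_cutoff.items())]
    # Anything above the largest cutoff is labelled as not significant.
    pairs.append([1, ns_label])
    return pairs


# Cutoffs requested in the question: * at 1e-4, ** at 1e-5, *** at 1e-6.
pvalue_thresholds = thresholds_from_dict({1e-4: 1, 1e-5: 2, 1e-6: 3})
print(pvalue_thresholds)
```

The resulting list could then be supplied as the pvalue_thresholds argument of annot.configure, in the same way as the hardcoded list in the call above.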
Regarding python - Customizing the p-value thresholds for the "star" text format in statannotations, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/76932969/