python - 在统计注释中自定义 "star"文本格式的 p 值阈值

statannotations 包提供了绘图中数据对的统计显着性水平的可视化注释(例如，在 seaborn 箱线图或条形图中)。这些注释可以采用“星号”文本格式，其中一个或多个星号出现在数据对之间的条形顶部: .

有没有办法自定义星星的阈值？我希望第一个显着性阈值的阈值是 0.0001，而不是 0.05，两颗星 ** 为 0.00001，三颗星 *** 为 0.000001。

示例图是从 example codes 生成的来自 statsannotations 的 github 页面:

from statannotations.Annotator import Annotator
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd

df = sns.load_dataset("tips")
x = "day"
y = "total_bill"
order = ['Sun', 'Thur', 'Fri', 'Sat']
ax = sns.boxplot(data=df, x=x, y=y, order=order)
annot = Annotator(ax, [("Thur", "Fri"), ("Thur", "Sat"), ("Fri", "Sun")], data=df, x=x, y=y, order=order)
annot.configure(test='Mann-Whitney', text_format='star', loc='outside', verbose=2)
annot.apply_test()
ax, test_results = annot.annotate()
plt.savefig('example_non-hue_outside.png', dpi=300, bbox_inches='tight')

将 verbose 设置为 2 时，这还会告诉我们用于确定条形上方出现多少颗星星的阈值:

p-value annotation legend:
      ns: p <= 1.00e+00
       *: 1.00e-02 < p <= 5.00e-02
      **: 1.00e-03 < p <= 1.00e-02
     ***: 1.00e-04 < p <= 1.00e-03
    ****: p <= 1.00e-04

我想向 Annotator 提供类似 p 值阈值字典:星星数量的内容，但我不知道应该提供什么参数。

最佳答案

在他们的存储库中，特别是在文件[Annotator.py][1]中:，我们有self._pvalue_format = PValueFormat()。这意味着我们可以改变同样的事情。 PValueFormat() 类，可以在 here 找到，具有以下可配置参数:

CONFIGURABLE_PARAMETERS = [
    'correction_format',
    'fontsize',
    'pvalue_format_string',
    'simple_format_string',
    'text_format',
    'pvalue_thresholds',
    'show_test_name'
]

为了完整起见，这里是代码的修改版本和新结果，其中两行显示 p 值的前后值。此外，图像也会相应变化。

# ! pip install statannotations
from smartprint import smartprint as sprint
from statannotations.Annotator import Annotator
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd

df = sns.load_dataset("tips")
x = "day"
y = "total_bill"
order = ['Sun', 'Thur', 'Fri', 'Sat']
ax = sns.boxplot(data=df, x=x, y=y, order=order)
annot = Annotator(ax, [("Thur", "Fri"), ("Thur", "Sat"), ("Fri", "Sun")], data=df, x=x, y=y, order=order)

print ("Before hardcoding pvalue thresholds ")
sprint (annot.get_configuration()["pvalue_format"])


annot.configure(test='Mann-Whitney', text_format='star', loc='outside', verbose=2)
annot._pvalue_format.pvalue_thresholds =  [[0.01, '****'], [0.03, '***'], [0.2, '**'], [0.6, '*'], [1, 'ns']]
annot.apply_test()
ax, test_results = annot.annotate()
plt.savefig('example_non-hue_outside.png', dpi=300, bbox_inches='tight')

print ("After hardcoding pvalue thresholds ")
sprint (annot.get_configuration()["pvalue_format"])

输出:

Before hardcoding pvalue thresholds 
Dict: annot.get_configuration()["pvalue_format"]
Key: Value

{'correction_format': '{star} ({suffix})',
 'fontsize': 'medium',
 'pvalue_format_string': '{:.3e}',
 'pvalue_thresholds': [[0.0001, '****'],
                       [0.001, '***'],
                       [0.01, '**'],
                       [0.05, '*'],
                       [1, 'ns']],
 'show_test_name': True,
 'simple_format_string': '{:.2f}',
 'text_format': 'star'}

p-value annotation legend:
      ns: p <= 1.00e+00
       *: 2.00e-01 < p <= 6.00e-01
      **: 3.00e-02 < p <= 2.00e-01
     ***: 1.00e-02 < p <= 3.00e-02
    ****: p <= 1.00e-02

Thur vs. Fri: Mann-Whitney-Wilcoxon test two-sided, P_val:6.477e-01 U_stat=6.305e+02
Thur vs. Sat: Mann-Whitney-Wilcoxon test two-sided, P_val:4.690e-02 U_stat=2.180e+03
Sun vs. Fri: Mann-Whitney-Wilcoxon test two-sided, P_val:2.680e-02 U_stat=9.605e+02
After hardcoding pvalue thresholds 
Dict: annot.get_configuration()["pvalue_format"]
Key: Value

{'correction_format': '{star} ({suffix})',
 'fontsize': 'medium',
 'pvalue_format_string': '{:.3e}',
 'pvalue_thresholds': [[0.01, '****'],
                       [0.03, '***'],
                       [0.2, '**'],
                       [0.6, '*'],
                       [1, 'ns']],
 'show_test_name': True,
 'simple_format_string': '{:.2f}',
 'text_format': 'star'}

图片:

编辑: 基于用户:Bonlenfum根据注释，更改阈值也可以通过在调用.configure时简单地附加键值来实现，如下所示:

annot.configure(test='Mann-Whitney', text_format='star', loc='outside',\
verbose=2, pvalue_thresholds=[[0.01, '****'], \
[0.03, '***'], [0.2, '**'], [0.6, '*'], [1, 'ns']])

关于python - 在统计注释中自定义 "star"文本格式的 p 值阈值，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/76932969/

python - 在统计注释中自定义 "star"文本格式的 p 值阈值

上一篇：rust - T <'a>: ' b 是 'a: ' b 的语法糖吗？

下一篇：快速准确计算double的小数指数