sql - "Running Product"PostgreSql 中的聚合/窗口函数?

标签 sql postgresql aggregate

我正在尝试在 PostgreSql 中规范化日终股票价格。

假设我有一个这样定义的股票表:

create table eod (
  date date not null,
  stock_id int not null,
  split decimal(16,8) not null,
  close decimal(12,6) not null,
  constraint pk_eod primary key (date, stock_id)
);

此表中的数据可能如下所示:

"date","stock_id","eod_split","close"
"2014-06-13",14010920,"1.00000000","182.560000"
"2014-06-13",14010911,"1.00000000","91.280000"
"2014-06-13",14010923,"1.00000000","41.230000"
"2014-06-12",14010911,"1.00000000","92.290000"
"2014-06-12",14010920,"1.00000000","181.220000"
"2014-06-12",14010923,"1.00000000","40.580000"
"2014-06-11",14010920,"1.00000000","182.250000"
"2014-06-11",14010911,"1.00000000","93.860000"
"2014-06-11",14010923,"1.00000000","40.860000"
"2014-06-10",14010911,"1.00000000","94.250000"
"2014-06-10",14010923,"1.00000000","41.110000"
"2014-06-10",14010920,"1.00000000","184.290000"
"2014-06-09",14010920,"1.00000000","186.220000"
"2014-06-09",14010911,"7.00000000","93.700000"
"2014-06-09",14010923,"1.00000000","41.270000"
"2014-06-06",14010923,"1.00000000","41.480000"
"2014-06-06",14010911,"1.00000000","645.570000"
"2014-06-06",14010920,"1.00000000","186.370000"
"2014-06-05",14010920,"1.00000000","185.980000"
"2014-06-05",14010911,"1.00000000","647.350000"
"2014-06-05",14010923,"1.00000000","41.210000"
... 
"2005-03-04",14010920,"1.00000000","92.370000"
"2005-03-04",14010911,"1.00000000","42.810000"
"2005-03-04",14010923,"1.00000000","25.170000"
"2005-03-03",14010923,"1.00000000","25.170000"
"2005-03-03",14010911,"1.00000000","41.790000"
"2005-03-03",14010920,"1.00000000","92.410000"
"2005-03-02",14010920,"1.00000000","92.920000"
"2005-03-02",14010923,"1.00000000","25.260000"
"2005-03-02",14010911,"1.00000000","44.121000"
"2005-03-01",14010920,"1.00000000","93.300000"
"2005-03-01",14010923,"1.00000000","25.280000"
"2005-03-01",14010911,"1.00000000","44.500000"
"2005-02-28",14010923,"1.00000000","25.160000"
"2005-02-28",14010911,"2.00000000","44.860000"
"2005-02-28",14010920,"1.00000000","92.580000"
"2005-02-25",14010923,"1.00000000","25.250000"
"2005-02-25",14010920,"1.00000000","92.800000"
"2005-02-25",14010911,"1.00000000","88.990000"
"2005-02-24",14010923,"1.00000000","25.370000"
"2005-02-24",14010920,"1.00000000","92.640000"
"2005-02-24",14010911,"1.00000000","88.930000"
"2005-02-23",14010923,"1.00000000","25.200000"
"2005-02-23",14010911,"1.00000000","88.230000"
"2005-02-23",14010920,"1.00000000","92.100000"
...
"2003-02-24",14010920,"1.00000000","78.560000"
"2003-02-24",14010911,"1.00000000","14.740000"
"2003-02-24",14010923,"1.00000000","24.070000"
"2003-02-21",14010920,"1.00000000","79.950000"
"2003-02-21",14010923,"1.00000000","24.630000"
"2003-02-21",14010911,"1.00000000","15.000000"
"2003-02-20",14010911,"1.00000000","14.770000"
"2003-02-20",14010920,"1.00000000","79.150000"
"2003-02-20",14010923,"1.00000000","24.140000"
"2003-02-19",14010920,"1.00000000","79.510000"
"2003-02-19",14010911,"1.00000000","14.850000"
"2003-02-19",14010923,"1.00000000","24.530000"
"2003-02-18",14010923,"2.00000000","24.960000"
"2003-02-18",14010911,"1.00000000","15.270000"
"2003-02-18",14010920,"1.00000000","79.330000"
"2003-02-14",14010911,"1.00000000","14.670000"
"2003-02-14",14010920,"1.00000000","77.450000"
"2003-02-14",14010923,"1.00000000","48.300000"
"2003-02-13",14010920,"1.00000000","75.860000"
"2003-02-13",14010911,"1.00000000","14.540000"
"2003-02-13",14010923,"1.00000000","46.990000"

注意“拆分”列。当记录的拆分值不是 1 时,这基本上意味着股票按该因素拆分。 IOW,当拆分为 2.0 时,流通股数量翻了一番,但从那时起每股值(value)减半。如果股票每股值(value) 100 美元,那么它现在每股值(value) 50 美元。

如果你用原始数字绘制图表,这种事情真的很难看。当公司的整体值(value)没有显着变化时,就会出现陡峭的悬崖……当你进行多次拆分时,你最终得到的图表不能正确反射(reflect)公司的趋势,而且幅度通常很大。在上面的示例中,如果有 2:1 的拆分,您的股票收盘价将类似于 100、100、100、50、50、50。

我想使用此表以相当有效的方式创建“规范化”价格(有相当多的记录要分块)。继续示例,这将显示股票价格为 50、50、50、50、50、50。如果有多个拆分,如果我们忽略实际市场值(value)变化,数据应该仍然一致且平滑。

我的想法是,如果我可以创建拆分值的“运行产品”聚合的 CTE,回到过去,我可以定义每只股票的日期范围以及应用于收盘成本的修正值应该是多少,然后将其连接回 eod 表,并在新表中选择每只股票的调整后收盘价。

...问题是,除了一大堆临时表和多步骤流程之外,我无法全神贯注于如何做到这一点。我也不知道有什么内置功能可以使这更容易。

谁能告诉我如何生成标准化数据?

最佳答案

您不需要 CTE。您只需要一个累积产品。 Postgres 没有内置的。但是,算术来拯救!

select eod.*,
       exp(sum(ln(eod_split)) over (partition by stock_id order by date)) as cume_split,
       (close *
        exp(sum(ln(eod_split)) over (partition by stock_id order by date))
       ) as normalized_price
from eod;

关于sql - "Running Product"PostgreSql 中的聚合/窗口函数?,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/45247458/

相关文章:

django - 在 Postgres 中使用 jsonb(与 Django 的 JSONField 一起使用)唯一索引实际上做了什么?

java - Postgres 异常 : No results were returned by the query.

node.js - 如何使用聚合函数计算mongoDB中的唯一数据

sql - 按字段值使用 Google BigQuery 拆分时间序列

postgresql - 为什么从 postgres 选择所有表会根据所使用的语法给出不同的结果

mysql - 选择过去 2 年或任何一年每月的同一天

python - 如何使用 pandas 将 Datetime 对象聚合为 int

mongodb - 如何使用聚合在一个mongo文档中独立排序多个数组

php - 是我做错了什么还是 php 不支持这个?

sql - Notepad++/Eclipse sql代码自动缩进选项?