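I have two tables. One holds stock prices and the other holds the number of shares for each stock. I want to join the two tables and calculate the market capitalization of each stock.
Here is sample data with 3 stocks that I created to reproduce the problem.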
CREATE TABLE stock_prices (country_exchange_code VARCHAR(2), stock_code VARCHAR(4), date DATE, close FLOAT, PRIMARY KEY (country_exchange_code,stock_code,date));
INSERT INTO stock_prices VALUES
('T', '1301', '2019-10-29', 75.2),
('T', '1301', '2019-10-30', 76.6),
('T', '1301', '2019-10-31', 77.6),
('T', '1301', '2019-11-01', 77.2),
('T', '1332', '2019-10-29', 52.5),
('T', '1332', '2019-10-30', 49.7),
('T', '1332', '2019-10-31', 50.8),
('T', '1332', '2019-11-01', 50.4),
('T', '1333', '2019-10-29', 13.9),
('T', '1333', '2019-10-30', 13.8),
('T', '1333', '2019-10-31', 14.3),
('T', '1333', '2019-11-01', 14.4);
CREATE TABLE stock_shares (country_exchange_code VARCHAR(2), stock_code VARCHAR(4), Num_Shares INT, PRIMARY KEY (country_exchange_code,stock_code));
INSERT INTO stock_shares VALUES
('T', '1301', 241587962),
('T', '1332', 369875187),
('T', '1333', 958621587);
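The following query joins the two tables on country code and stock code, then lists the number of shares and the last closing price, which are the inputs for calculating market cap. I use the last_value window function to get the last closing price.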
SELECT Stock_Code, Date, Num_Shares,
last_value(Close) OVER (PARTITION BY Stock_Code ORDER BY Date ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS Last_Close,
Num_Shares * last_value(Close) OVER (PARTITION BY Stock_Code ORDER BY Date ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS Mkt_Cap
FROM stock_prices LEFT JOIN stock_shares USING (Country_Exchange_Code, Stock_Code)
WHERE Country_Exchange_Code = 'T' AND Date >= '2019-10-29'
ORDER BY Stock_Code, Date;
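This works as expected and produces the following result:
Result 1:
Next, I want to use DISTINCT to get a single row per stock. To do that, I first need to remove every column except Stock_Code and Mkt_Cap. This is where the problem occurs. When I remove the Last_Close column from the SELECT statement: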
SELECT Stock_Code, Date, Num_Shares,
Num_Shares * last_value(Close) OVER (PARTITION BY Stock_Code ORDER BY Date ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS Mkt_Cap
FROM stock_prices LEFT JOIN stock_shares USING (Country_Exchange_Code, Stock_Code)
WHERE Country_Exchange_Code = 'T' AND Date >= '2019-10-29'
ORDER BY Stock_Code, Date;
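I get unexpected NULLs in the first row of each Stock_Code partition.
Result 2:
Why does this happen? There are no NULLs in my tables and, as the first result shows, all the data needed to compute Mkt_Cap is there.
Additional information: removing Date and/or Num_Shares from the SELECT statement causes no problem. Only removing the standalone last_value column (Last_Close) triggers the issue.
Interestingly, the problem disappears when I remove the WHERE clause. I don't understand how that affects the result, because in my small sample the WHERE clause doesn't filter anything out: all of my rows have Country_Exchange_Code = 'T' and Date >= '2019-10-29'. In my real dataset with millions of rows, however, the WHERE clause is essential, so removing it is not a solution.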
Best answer
I don't see what else you could do here; I still think this is a bug. To work around it:
SELECT
    Stock_Code, `Date`, Num_Shares, (Num_Shares * Last_Close) AS Mkt_Cap
FROM
    -- Evaluate the window function inside a derived table, then multiply outside it.
    (SELECT Stock_Code, `Date`, Num_Shares, Close,
        (last_value(Close) OVER (PARTITION BY Stock_Code
                                 ORDER BY `Date`
                                 ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)) AS Last_Close
    FROM stock_prices LEFT JOIN stock_shares USING (Country_Exchange_Code, Stock_Code)
    WHERE Country_Exchange_Code = 'T' AND `Date` >= '2019-10-29'
    ) t1
ORDER BY Stock_Code, `Date`;
You can see it working in the last SELECT at the bottom of this fiddle: https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=43308a7caac9e804e6a65d48b3fa7490
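Building on the workaround above, the one-row-per-stock result the question was ultimately after might be sketched as follows. Treat this as a rough, unverified sketch rather than a confirmed fix: it assumes the question's table names (stock_prices, stock_shares), and the aliases Last_Close, latest, p, s and last_date are names chosen here for illustration. The first statement keeps the window function inside a derived table and applies DISTINCT outside it. The second statement is a different technique, a plain "latest row per group" join on MAX(date), which avoids window functions altogether and so should not depend on the suspected bug.

```sql
-- Sketch 1: the window function stays in the derived table; DISTINCT collapses
-- the identical per-date rows to one row per stock.
SELECT DISTINCT
    Stock_Code,
    Num_Shares * Last_Close AS Mkt_Cap
FROM
    (SELECT Stock_Code, Num_Shares,
            last_value(Close) OVER (PARTITION BY Stock_Code
                                    ORDER BY `Date`
                                    ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS Last_Close
     FROM stock_prices LEFT JOIN stock_shares USING (Country_Exchange_Code, Stock_Code)
     WHERE Country_Exchange_Code = 'T' AND `Date` >= '2019-10-29'
    ) t1
ORDER BY Stock_Code;

-- Sketch 2 (alternative technique, no window function): find the latest date
-- per stock, join back to fetch that day's close, then attach the share count.
SELECT p.stock_code AS Stock_Code,
       s.Num_Shares * p.close AS Mkt_Cap
FROM (SELECT country_exchange_code, stock_code, MAX(`date`) AS last_date
      FROM stock_prices
      WHERE country_exchange_code = 'T' AND `date` >= '2019-10-29'
      GROUP BY country_exchange_code, stock_code) AS latest
JOIN stock_prices AS p
  ON p.country_exchange_code = latest.country_exchange_code
 AND p.stock_code = latest.stock_code
 AND p.`date` = latest.last_date
LEFT JOIN stock_shares AS s
  ON s.country_exchange_code = p.country_exchange_code
 AND s.stock_code = p.stock_code
ORDER BY p.stock_code;
```

Both statements should return one row per stock for the sample data. The second one can also use the (country_exchange_code, stock_code, date) primary key for the join back to stock_prices, which may matter on a table with millions of rows.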
A similar question about a MySQL SELECT with LEFT JOIN unexpectedly inserting NULL into the first row of each partition can be found on Stack Overflow: https://stackoverflow.com/questions/58681085/