Here is the dummy data: a table of call records.
Here is a glimpse of it:
| call_id | customer | company | call_start |
|-----------|--------------|-------------|---------------------|
|1411482360 | 001143792042 | 08444599175 | 2014-07-31 13:55:03 |
|1476992122 | 001143792042 | 08441713191 | 2014-07-31 14:05:10 |
The `customer` and `company` fields hold their phone numbers.
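- The requirement is to calculate total "gain" and total "lost" values according to the following logic:
Edit:
- Customer A calls company A.
- If customer A then calls company B, company B gets +1 gain and company A gets +1 lost.
- If customer A then calls company C, company C gets +1 gain and company B gets +1 lost.
- If customer A calls company C again, the gain/lost values are not affected.
- Gain/lost only comes into effect once customer A has made a second call.
- If a customer calls companies in the order A, B, B, C, A, A, C, B, D, the flow should look like this: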
A ->
B -> B +1 gain, A +1 lost
B ->
C -> C +1 gain, B +1 lost
A -> A +1 gain, C +1 lost
A ->
C -> C +1 gain, A +1 lost
B -> B +1 gain, C +1 lost
D -> D +1 gain, B +1 lost
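After the process above, we should end up with these totals:
| Company | Total gain | Total lost |
|---------|------------|------------|
| A       | 1          | 2          |
| B       | 2          | 2          |
| C       | 2          | 2          |
| D       | 1          | 0          |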
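The flow above can be replayed mechanically. This is a minimal Python sketch, assuming the calls of a single customer are already in chronological order; the function name `gain_lost` is made up for illustration.

```python
from collections import Counter

def gain_lost(calls):
    """Replay one customer's calls in order and tally gain/lost per company."""
    gain, lost = Counter(), Counter()
    previous = None
    for company in calls:
        # Only a switch to a different company changes the totals.
        if previous is not None and company != previous:
            gain[company] += 1
            lost[previous] += 1
        previous = company
    return gain, lost

gain, lost = gain_lost(["A", "B", "B", "C", "A", "A", "C", "B", "D"])
for company in "ABCD":
    print(company, gain[company], lost[company])
```

For the sequence A, B, B, C, A, A, C, B, D this prints A 1 2, B 2 2, C 2 2 and D 1 0, matching the totals listed above.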
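I started working on this, but it is wrong. It is only an idea, and it does not give me separate incremental gain and lost values according to the conditions above: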
DROP TABLE IF EXISTS GetTotalGainAndLost;
CREATE TEMPORARY TABLE IF NOT EXISTS GetTotalGainAndLost
AS
(
    SELECT SUM(count) as 'TotalGainAndLost', `date`, DAY(`date`) as 'DAY'
    FROM (SELECT count(*) as 'count', customer, `date`
          FROM (SELECT customer, company, count(*) AS 'count', DATE_FORMAT(`call_end`,'%Y-%m-%d') as 'date'
                FROM calls
                WHERE `call_end` LIKE CONCAT(2014, '-', RIGHT(CAST(concat('0', 01) AS CHAR),2),'-%')
                GROUP BY customer, company, DAY(`call_end`) ORDER BY `call_end` ASC)
          as tbl1 group by customer, `date` having count(*) > 1)
    as tbl2 GROUP by `date`
);
Select * from GetTotalGainAndLost;
DROP TABLE GetTotalGainAndLost;
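This query does not return any results.
- The desired output looks like this:
There should be one row per company and date (for example, for January, the total gain and lost calls by day).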
| company | totalGain | totalLost | date | DAY |
|-------------|------------|-------------|--------------|-------|
| 08444599175 | 17 | 6 | 2014-07-01 | 1 |
| 08444599175 | 12 | 10 | 2014-07-02 | 2 |
| 08444599175 | 3          | 6           | 2014-07-03   | 3     |
| 08444599175 | .... | ... | ... | ... |
| 08444599175 | 7 | 6 | 2014-07-31 | 31 |
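Best answer
Simplification
Let N denote the number of times a company appears. Let's try to simplify the formula with three simple rules.
- The first company to appear will have N - 1 gains and N losses.
- Companies in the middle will have N gains and N losses.
- The last company will have N gains and N - 1 losses.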
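Test
In your example:
- It starts with company A, which appears 3 times.
- Company B appears 3 times.
- Company C appears 2 times.
- It ends with company D, which appears 1 time.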
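Result
| Company | Gain | Lost |
|---------|------|------|
| A       | 2    | 3    |
| B       | 3    | 3    |
| C       | 2    | 2    |
| D       | 1    | 0    |
Note that these totals count every call as an appearance, so they differ from the totals expected in the question (A 1/2, B 2/2) wherever a customer calls the same company twice in a row.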
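The three rules can be checked in a few lines of Python. This is only a sketch: `totals_by_rule` and its `collapse_repeats` flag are hypothetical names, not part of the original answer. With the flag set, each run of consecutive calls to the same company is treated as a single appearance.

```python
from collections import Counter
from itertools import groupby

def totals_by_rule(calls, collapse_repeats=False):
    """Apply the first/middle/last rule to one customer's ordered calls."""
    if collapse_repeats:
        # Keep one entry per run of consecutive calls to the same company.
        calls = [company for company, _ in groupby(calls)]
    counts = Counter(calls)
    gain, lost = Counter(counts), Counter(counts)
    if calls:
        gain[calls[0]] -= 1   # the first call gains nothing
        lost[calls[-1]] -= 1  # the last call loses nothing
    return {company: (gain[company], lost[company]) for company in counts}

sequence = ["A", "B", "B", "C", "A", "A", "C", "B", "D"]
every_call = totals_by_rule(sequence)
collapsed = totals_by_rule(sequence, collapse_repeats=True)
print(every_call)
print(collapsed)
```

Counting every call gives A (2, 3), B (3, 3), C (2, 2), D (1, 0). Collapsing repeats first gives A (1, 2), B (2, 2), C (2, 2), D (1, 0), which are the totals the question asks for.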
Converting to SQL
First, we count the occurrences of each company per day.
SELECT
    company, COUNT(*) AS gain, COUNT(*) AS lost, DATE(call_start) AS `date`
FROM calls
GROUP BY DATE(call_start), company
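Then, for each company, we count how many times it was a customer's first call of the day. The count is negated because a first call gains nothing. The lowest call_id is taken as the first call, so this assumes call_id increases over time.
SELECT company, -COUNT(*) AS gain, 0 AS lost, DATE(call_start) AS `date`
FROM calls INNER JOIN (
    SELECT MIN(call_id) AS call_id FROM calls GROUP BY DATE(call_start), customer
) AS t ON (calls.call_id = t.call_id)
GROUP BY DATE(call_start), calls.company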
Next, we count how many times each company was a customer's last call of the day, negated because a last call loses nothing.
SELECT company, 0 AS gain, -COUNT(*) AS lost, DATE(call_start) AS `date`
FROM calls INNER JOIN (
    SELECT MAX(call_id) AS call_id FROM calls GROUP BY DATE(call_start), customer
) AS t ON (calls.call_id = t.call_id)
GROUP BY DATE(call_start), calls.company
Combining the SQL
Finally, we can use UNION ALL to combine all of the SQL and then apply another GROUP BY.
SELECT company, SUM(gain) AS gain, SUM(lost) AS lost, `date` FROM (
    (
        -- every call counts as one gain and one loss
        SELECT
            company, COUNT(*) AS gain, COUNT(*) AS lost, DATE(call_start) AS `date`
        FROM calls
        GROUP BY DATE(call_start), company
    ) UNION ALL (
        -- a customer's first call of the day gains nothing
        SELECT company, -COUNT(*) AS gain, 0 AS lost, DATE(call_start) AS `date`
        FROM calls INNER JOIN (
            SELECT MIN(call_id) AS call_id FROM calls GROUP BY DATE(call_start), customer
        ) AS t ON (calls.call_id = t.call_id)
        GROUP BY DATE(call_start), calls.company
    ) UNION ALL (
        -- a customer's last call of the day loses nothing
        SELECT company, 0 AS gain, -COUNT(*) AS lost, DATE(call_start) AS `date`
        FROM calls INNER JOIN (
            SELECT MAX(call_id) AS call_id FROM calls GROUP BY DATE(call_start), customer
        ) AS t ON (calls.call_id = t.call_id)
        GROUP BY DATE(call_start), calls.company
    )
) AS t
GROUP BY `date`, company
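Clarification
The query above assumes that every new day is independent. For example:
- Customer A calls company A (day 1)
- Customer A calls company B (day 1): B gains 1, A loses 1
- Customer A calls company C (day 1): C gains 1, B loses 1
- Customer A calls company D (day 2)
- Customer A calls company E (day 2): E gains 1, D loses 1
The result is:
| Company | Gain | Lost | Day |
|---------|------|------|-----|
| A       | 0    | 1    | 1   |
| B       | 1    | 1    | 1   |
| C       | 1    | 0    | 1   |
| D       | 0    | 1    | 2   |
| E       | 1    | 0    | 2   |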
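One way to try the combined query on this two-day example is an in-memory SQLite database driven from Python. This is a sketch under assumptions rather than the original MySQL setup: the call ids, the customer name `cust` and the timestamps are made up, the parentheses around the UNION ALL branches are dropped because SQLite does not accept them, and the `date` alias is renamed `day`. As in the original query, call_id is assumed to increase with call time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE calls (call_id INTEGER, customer TEXT, company TEXT, call_start TEXT)"
)
# Made-up rows mirroring the example: A, B, C on day 1 and D, E on day 2.
conn.executemany(
    "INSERT INTO calls VALUES (?, ?, ?, ?)",
    [
        (1, "cust", "A", "2014-07-01 09:00:00"),
        (2, "cust", "B", "2014-07-01 10:00:00"),
        (3, "cust", "C", "2014-07-01 11:00:00"),
        (4, "cust", "D", "2014-07-02 09:00:00"),
        (5, "cust", "E", "2014-07-02 10:00:00"),
    ],
)

query = """
SELECT company, SUM(gain) AS gain, SUM(lost) AS lost, day FROM (
    SELECT company, COUNT(*) AS gain, COUNT(*) AS lost, DATE(call_start) AS day
    FROM calls
    GROUP BY DATE(call_start), company
  UNION ALL
    SELECT company, -COUNT(*), 0, DATE(call_start)
    FROM calls INNER JOIN (
        SELECT MIN(call_id) AS call_id FROM calls GROUP BY DATE(call_start), customer
    ) AS t ON calls.call_id = t.call_id
    GROUP BY DATE(call_start), calls.company
  UNION ALL
    SELECT company, 0, -COUNT(*), DATE(call_start)
    FROM calls INNER JOIN (
        SELECT MAX(call_id) AS call_id FROM calls GROUP BY DATE(call_start), customer
    ) AS t ON calls.call_id = t.call_id
    GROUP BY DATE(call_start), calls.company
) AS u
GROUP BY day, company
ORDER BY day, company
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Each printed row is (company, gain, lost, day). On this data the rows agree with the table above: A 0/1, B 1/1 and C 1/0 on the first day, D 0/1 and E 1/0 on the second.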
Regarding MySQL: getting a count of incremental items by multiple conditions, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/28046327/