我正在为我们的网站开发一个销售排名小部件,它将根据产品在“热门排行榜”或“畅销书”列表(可以这么说)中的当前位置显示产品排序。
阅读一些内容后,看起来实现此方法的一个好方法就是滚动平均销售算法,其中销售越近,其权重就越高。
示例:
$rolling_avg = ((4*$d1)+(3*$d2)+(2*$d3)+$d4+$d5+$d6+$d7)/13;
地点:
- $d1 = 过去 24 小时内的销售数量。
- $d2 = 过去 24-48 小时内的销售数量。
- $d3 = 过去 48-72 小时内的销售数量。
- $d4 = 过去 72-96 小时内的销售数量。
等等……
目前,我正在尝试在包含大约 50 万条记录的产品数据集上运行此程序,将计算出的排名重新插入到产品表中,以便稍后查询。如果可能的话,我希望能够创建一个脚本来重新计算排名并每 12 或 24 小时在 cron 上运行一次。
当前实现:
我当前的实现执行时间太长,而且我觉得需要在 SQL 级别完成更多处理(使用更少的 SELECT 查询),但我不确定如何着手开始它。
$products = mysql_query("SELECT * FROM products ORDER BY id DESC"); // <-- Est 450-500k rows.
while($product = mysql_fetch_array($products)) {
$product_id = $product['id'];
$d1 = mysql_query("SELECT COUNT(*) FROM orders WHERE (product_id = '$product_id') AND (sale_completed BETWEEN (NOW()-INTERVAL 24 HOUR) AND NOW())") or die(mysql_error);
$d1 = mysql_fetch_array($d1);
$d2 = mysql_query("SELECT COUNT(*) FROM orders WHERE (product_id = '$product_id') AND (sale_completed BETWEEN (NOW()-INTERVAL 48 HOUR) AND (NOW()-INTERVAL 24 HOUR))");
$d2 = mysql_fetch_array($d2);
$d3 = mysql_query("SELECT COUNT(*) FROM orders WHERE (product_id = '$product_id') AND (sale_completed BETWEEN (NOW()-INTERVAL 72 HOUR) AND (NOW()-INTERVAL 48 HOUR))");
$d3 = mysql_fetch_array($d3);
$d4 = mysql_query("SELECT COUNT(*) FROM orders WHERE (product_id = '$product_id') AND (sale_completed BETWEEN (NOW()-INTERVAL 96 HOUR) AND (NOW()-INTERVAL 72 HOUR))");
$d4 = mysql_fetch_array($d4);
$d5 = mysql_query("SELECT COUNT(*) FROM orders WHERE (product_id = '$product_id') AND (sale_completed BETWEEN (NOW()-INTERVAL 120 HOUR) AND (NOW()-INTERVAL 96 HOUR))");
$d5 = mysql_fetch_array($d5);
$d6 = mysql_query("SELECT COUNT(*) FROM orders WHERE (product_id = '$product_id') AND (sale_completed BETWEEN (NOW()-INTERVAL 144 HOUR) AND (NOW()-INTERVAL 120 HOUR))");
$d6 = mysql_fetch_array($d6);
$d7 = mysql_query("SELECT COUNT(*) FROM orders WHERE (product_id = '$product_id') AND (sale_completed BETWEEN (NOW()-INTERVAL 168 HOUR) AND (NOW()-INTERVAL 144 HOUR))");
$d7 = mysql_fetch_array($d7);
$d1 = $d1[0];
$d2 = $d2[0];
$d3 = $d3[0];
$d4 = $d4[0];
$d5 = $d5[0];
$d6 = $d6[0];
$d7 = $d7[0];
$rolling_avg = ((4*$d1)+(3*$d2)+(2*$d3)+$d4+$d5+$d6+$d7)/13;
mysql_query("UPDATE products SET rolling_sales = '$rolling_avg' WHERE id = '$product_id'");
}
不确定如何从这里优化/进步。但这肯定需要很多的工作。
在提到它之前,我知道 mysql_*
函数已过时,我会在它进入生产环境之前将其移至 PDO。
最佳答案
这是一个使用单个查询计算滚动销售额的函数。
function get_rolling_sales($product_id) {
$query = <<<EOF
SELECT (
4 * (
SELECT COUNT(*) FROM orders WHERE (product_id = $product_id)
AND (sale_completed BETWEEN (NOW()-INTERVAL 24 HOUR) AND NOW())
) + 3 * (
SELECT COUNT(*) FROM orders WHERE (product_id = $product_id)
AND (sale_completed BETWEEN (NOW()-INTERVAL 48 HOUR) AND (NOW()-INTERVAL 24 HOUR))
) + 2 * (
SELECT COUNT(*) FROM orders WHERE (product_id = $product_id)
AND (sale_completed BETWEEN (NOW()-INTERVAL 72 HOUR) AND (NOW()-INTERVAL 48 HOUR))
) + (
SELECT COUNT(*) FROM orders WHERE (product_id = $product_id)
AND (sale_completed BETWEEN (NOW()-INTERVAL 96 HOUR) AND (NOW()-INTERVAL 72 HOUR))
) + (
SELECT COUNT(*) FROM orders WHERE (product_id = $product_id)
AND (sale_completed BETWEEN (NOW()-INTERVAL 120 HOUR) AND (NOW()-INTERVAL 96 HOUR))
) + (
SELECT COUNT(*) FROM orders WHERE (product_id = $product_id)
AND (sale_completed BETWEEN (NOW()-INTERVAL 144 HOUR) AND (NOW()-INTERVAL 120 HOUR))
) + (
SELECT COUNT(*) FROM orders WHERE (product_id = $product_id)
AND (sale_completed BETWEEN (NOW()-INTERVAL 168 HOUR) AND (NOW()-INTERVAL 144 HOUR))
)
) / 13 AS rolling_sales
EOF;
$result = mysql_query($query);
$row = mysql_fetch_assoc($result);
return $row['rolling_sales'];
}
然而,遍历所有 500.000 条产品记录仍然需要很长时间。您真的一次需要所有这些信息(例如用于计算)还是计划在分页 TableView 中显示这些信息?如果您只想显示数据,按需计算 rolling_sales 应该没问题。
关于php - 滚动销售平均算法实现 MySQL + PHP,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/13925205/