sql - Leaderboard design using SQL Server

Tags: sql sql-server database database-design azure-sql-database

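I am building a leaderboard for some of my online games. Here is what I need to do with the data:

  • Get the rank of a player for a given game across multiple time frames (today, last week, all time, etc.)
  • Get paginated rankings (e.g. top scores for the last 24 hours, players between rank 25 and 50, or the rank of a single user)

I came up with the following table definition and index, and I have a couple of questions.

Considering my scenarios, do I have a good primary key? The only reason I have a clustered key across gameId, playerName and score is that I want all data for a given game to be stored in the same area, with the score already sorted. Most of the time I will display the data for a given gameId in descending score order (+ updatedDateTime for ties). Is this the right strategy? In other words, I want to make sure I can run queries to get the rank of my players as fast as possible.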

CREATE TABLE score (
    [gameId]            [smallint] NOT NULL,
    [playerName]        [nvarchar](50) NOT NULL,
    [score]             [int] NOT NULL,
    [createdDateTime]   [datetime2](3) NOT NULL,
    [updatedDateTime]   [datetime2](3) NOT NULL,
    PRIMARY KEY CLUSTERED ([gameId] ASC, [playerName] ASC, [score] DESC, [updatedDateTime] ASC)
)

CREATE NONCLUSTERED INDEX [Score_Idx] ON score ([gameId] ASC, [score] DESC, [updatedDateTime] ASC) INCLUDE ([playerName])

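Below is the first iteration of the query I will use to get the rank of my players. However, I am a bit disappointed with the execution plan (see below). Why does SQL Server need to sort? The additional sort seems to come from the ranking function (DENSE_RANK). But isn't my data already sorted in descending order (based on the clustered key of the score table)? I am also wondering whether I should normalize my table a bit more and move the PlayerName column out into a Player table. I originally decided to keep everything in the same table to minimize the number of joins.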

DECLARE @GameId AS INT = 0
DECLARE @From AS DATETIME2(3) = '2013-10-01'

SELECT DENSE_RANK() OVER (ORDER BY Score DESC), s.PlayerName, s.Score, s.CountryCode, s.updatedDateTime
FROM [mrgleaderboard].[score] s
WHERE s.GameId = @GameId 
  AND (s.UpdatedDateTime >= @From OR @From IS NULL)

[Image: execution plan for the query above]

Thanks for your help!

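Best answer

[Updated]

The primary key is not good

You have a unique entity, which is [GameID] + [PlayerName]. The composite clustered key is wide (> 120 bytes) and contains an nvarchar column. Since every nonclustered index also carries the clustered key, a wide key makes all of them larger. See @marc_s's answer in the related topic "SQL Server - Clustered index design for dictionary".

Your table schema does not match your requirements for time periods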

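Example: I earn a score of 300 on Wednesday, and this score is stored on the leaderboard. The next day I earn 250, but that score is not recorded because it is lower, so a query against the next day's leaderboard returns no result for me.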

You can get the full information from a history table of game scores, but that can be very expensive:

CREATE TABLE GameLog (
  [id]                int NOT NULL IDENTITY
                      CONSTRAINT [PK_GameLog] PRIMARY KEY CLUSTERED,
  [gameId]            smallint NOT NULL,
  [playerId]          int NOT NULL,
  [score]             int NOT NULL,
  [createdDateTime]   datetime2(3) NOT NULL)
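
To make the cost concrete, here is a minimal sketch of ranking straight from GameLog. It assumes a player's leaderboard score is their best single score inside the time window (the answer does not say how scores are combined), and @GameId and @From are hypothetical parameters. Every call has to aggregate and sort all matching history rows, which is why this gets expensive as the table grows.

```sql
-- Sketch only: rank players for one game from the raw history table.
-- Assumption: leaderboard score = best single score since @From.
DECLARE @GameId smallint     = 3,
        @From   datetime2(3) = '2013-11-01';

SELECT DENSE_RANK() OVER (ORDER BY MAX(g.score) DESC) AS rnk,
       g.playerId,
       MAX(g.score) AS score
FROM GameLog AS g
WHERE g.gameId = @GameId
  AND g.createdDateTime >= @From
GROUP BY g.playerId
ORDER BY rnk;
```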

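Here are some aggregation-based solutions to speed it up:

  • Indexed view on the history table (see the post by @Twinkles).

You need 3 indexed views for the 3 time periods. Cons: a potentially huge history table plus 3 indexed views, no way to remove "old" periods from the table, and slower score inserts because every view has to be maintained.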

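  • Asynchronous leaderboard

Scores are saved in the history table. A SQL job or "worker" (or several) runs on a schedule (once per minute?), sorts the history table, and populates the leaderboard table with precomputed player ranks (3 tables for the 3 time periods, or one table with a time-period key). This table can also be denormalized (score, datetime, PlayerName and so on). Pros: fast reads (no sorting), fast score saves, arbitrary time periods, flexible logic, flexible schedule. Cons: a player who has just finished a game does not appear on the leaderboard immediately.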
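
As a rough sketch of what such a worker could run, the procedure below rebuilds one (game, time period) slice of a precomputed table from GameLog. The table LeaderboardSnapshot, the procedure RefreshLeaderboardSnapshot, and the timePeriod encoding are hypothetical names chosen for illustration, and it again assumes a player's score is their best score in the window.

```sql
-- Hypothetical precomputed table filled by the scheduled job.
CREATE TABLE LeaderboardSnapshot (
    gameId      smallint NOT NULL,
    timePeriod  tinyint  NOT NULL,  -- assumed encoding, e.g. 0 = all time, 1 = last week, 2 = today
    rnk         int      NOT NULL,
    playerId    int      NOT NULL,
    score       int      NOT NULL,
    CONSTRAINT PK_LeaderboardSnapshot
        PRIMARY KEY CLUSTERED (gameId, timePeriod, rnk, playerId)
);
GO

-- Rebuilds one slice; a SQL Agent job or external worker would call it on a schedule.
CREATE PROCEDURE RefreshLeaderboardSnapshot
    @GameId     smallint,
    @TimePeriod tinyint,
    @From       datetime2(3)
AS
BEGIN
    SET NOCOUNT ON;
    SET XACT_ABORT ON;

    BEGIN TRANSACTION;

    DELETE FROM LeaderboardSnapshot
    WHERE gameId = @GameId AND timePeriod = @TimePeriod;

    INSERT INTO LeaderboardSnapshot (gameId, timePeriod, rnk, playerId, score)
    SELECT @GameId, @TimePeriod,
           DENSE_RANK() OVER (ORDER BY best.score DESC),
           best.playerId, best.score
    FROM (SELECT playerId, MAX(score) AS score   -- assumption: best score wins
          FROM GameLog
          WHERE gameId = @GameId AND createdDateTime >= @From
          GROUP BY playerId) AS best;

    COMMIT TRANSACTION;
END;
GO
```

With the ranks stored, a page such as ranks 25 to 50 becomes a range seek on the clustered key (WHERE gameId = @GameId AND timePeriod = @TimePeriod AND rnk BETWEEN 25 AND 50) with no sort at read time.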

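  • Preaggregated leaderboard

Do the preprocessing while recording the result of a game session. In your case this is something like UPDATE [Leaderboard] SET score = @CurrentScore WHERE @CurrentScore > MAX (score) AND ... for the player/game ID, but you only do it for the "all time" leaderboard. The schema might look like this: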

CREATE TABLE [Leaderboard] (
    [id]                int NOT NULL IDENTITY
                             CONSTRAINT [PK_Leaderboard] PRIMARY KEY CLUSTERED,
    [gameId]            smallint NOT NULL,
    [playerId]          int NOT NULL,
    [timePeriod]        tinyint NOT NULL,   -- 0 -all time, 1-monthly, 2 -weekly, 3 -daily
    [timePeriodFrom]    date NOT NULL,  -- '1900-01-01' for all time, '2013-11-01' for monthly, etc.
    [score]             int NOT NULL,
    [createdDateTime]   datetime2(3) NOT NULL
    )

Example rows:

playerId    timePeriod  timePeriodFrom  Score
----------------------------------------------
1           0           1900-01-01      300  
...
1           1           2013-10-01      150
1           1           2013-11-01      300
...
1           2           2013-10-07      150
1           2           2013-11-18      300
...
1           3           2013-11-19      300
1           3           2013-11-20      250
...

So you have to update the score for every time period. As you can see, the leaderboard will also contain "old" periods, such as the monthly rows for October; you may want to delete them if you do not need those statistics. Pros: no history table is needed. Cons: the procedure for storing a result is more complicated, the leaderboard needs maintenance, and the query requires a sort and a JOIN.
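
The "update every time period" step could be sketched as follows. This is only an illustration against the [Leaderboard] table above: @GameId, @PlayerId and @CurrentScore are hypothetical inputs, weeks are assumed to start on Monday (matching the sample rows), and no locking hints are used, so concurrent saves for the same player would need extra care.

```sql
-- Sketch only: record one finished game in all period buckets.
DECLARE @GameId smallint = 3, @PlayerId int = 1, @CurrentScore int = 300;
DECLARE @Today date = CAST(SYSUTCDATETIME() AS date);

-- One row per bucket the new score belongs to.
DECLARE @Periods TABLE (timePeriod tinyint NOT NULL, timePeriodFrom date NOT NULL);
INSERT INTO @Periods (timePeriod, timePeriodFrom) VALUES
    (0, '19000101'),                                                         -- all time
    (1, DATEFROMPARTS(YEAR(@Today), MONTH(@Today), 1)),                      -- first day of month
    (2, DATEADD(DAY, -(DATEDIFF(DAY, '19000101', @Today) % 7), @Today)),     -- Monday of this week
    (3, @Today);                                                             -- today

BEGIN TRANSACTION;

-- Raise the stored score where the new one is higher.
UPDATE lb
SET lb.score = @CurrentScore,
    lb.createdDateTime = SYSUTCDATETIME()
FROM [Leaderboard] AS lb
JOIN @Periods AS p
  ON p.timePeriod = lb.timePeriod AND p.timePeriodFrom = lb.timePeriodFrom
WHERE lb.gameId = @GameId
  AND lb.playerId = @PlayerId
  AND lb.score < @CurrentScore;

-- Create rows for buckets the player has no entry in yet.
INSERT INTO [Leaderboard] (gameId, playerId, timePeriod, timePeriodFrom, score, createdDateTime)
SELECT @GameId, @PlayerId, p.timePeriod, p.timePeriodFrom, @CurrentScore, SYSUTCDATETIME()
FROM @Periods AS p
WHERE NOT EXISTS (SELECT 1
                  FROM [Leaderboard] AS lb
                  WHERE lb.gameId = @GameId
                    AND lb.playerId = @PlayerId
                    AND lb.timePeriod = p.timePeriod
                    AND lb.timePeriodFrom = p.timePeriodFrom);

COMMIT TRANSACTION;
```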

CREATE TABLE [Player] (
    [id]    int NOT NULL IDENTITY CONSTRAINT [PK_Player] PRIMARY KEY CLUSTERED,
    [playerName]        nvarchar(50) NOT NULL CONSTRAINT [UQ_Player_playerName] UNIQUE NONCLUSTERED)

CREATE TABLE [Leaderboard] (
    [id]                int NOT NULL IDENTITY CONSTRAINT [PK_Leaderboard] PRIMARY KEY CLUSTERED,
    [gameId]            smallint NOT NULL,
    [playerId]          int NOT NULL,
    [timePeriod]        tinyint NOT NULL,   -- 0 -all time, 1-monthly, 2 -weekly, 3 -daily
    [timePeriodFrom]    date NOT NULL,  -- '1900-01-01' for all time, '2013-11-01' for monthly, etc.
    [score]             int NOT NULL,
    [createdDateTime]   datetime2(3) 
)

CREATE UNIQUE NONCLUSTERED INDEX [UQ_Leaderboard_gameId_playerId_timePeriod_timePeriodFrom] ON [Leaderboard] ([gameId] ASC, [playerId] ASC, [timePeriod]  ASC,  [timePeriodFrom] ASC)
CREATE NONCLUSTERED INDEX [IX_Leaderboard_gameId_timePeriod_timePeriodFrom_Score] ON [Leaderboard] ([gameId] ASC, [timePeriod]  ASC,  [timePeriodFrom] ASC, [score] ASC)
GO

-- Generate test data
-- Generate 500K unique players
;WITH digits (d) AS (SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION
   SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9 UNION SELECT 0)

INSERT INTO Player (playerName)
SELECT TOP (500000) LEFT(CAST(NEWID() as nvarchar(50)), 20 + (ABS(CHECKSUM(NEWID())) & 15)) as Name
FROM   digits CROSS JOIN digits ii CROSS  JOIN digits iii CROSS  JOIN digits iv CROSS  JOIN digits v CROSS  JOIN digits vi

-- Random score 500K players * 4 games = 2M rows
INSERT INTO [Leaderboard] (
    [gameId],[playerId],[timePeriod],[timePeriodFrom],[score],[createdDateTime])
SELECT  GameID, Player.id,ABS(CHECKSUM(NEWID())) & 3 as [timePeriod], DATEADD(MILLISECOND, CHECKSUM(NEWID()),GETDATE()) as Updated, ABS(CHECKSUM(NEWID())) & 65535 as score
    , DATEADD(MILLISECOND, CHECKSUM(NEWID()),GETDATE()) as Created
FROM (  SELECT 1 as GameID  UNION ALL SELECT 2  UNION ALL SELECT 3  UNION ALL SELECT 4) as Game
    CROSS JOIN Player
ORDER BY NEWID()
UPDATE [Leaderboard] SET [timePeriodFrom]='19000101' WHERE [timePeriod] = 0
GO

DECLARE @From date = '19000101'--'20131108'
    ,@GameID int = 3
    ,@timePeriod tinyint = 0

-- Get paginated ranking 
;With Lb as (
SELECT 
    DENSE_RANK() OVER (ORDER BY Score DESC) as Rnk
    ,Score, createdDateTime, playerId
FROM [Leaderboard]
WHERE GameId = @GameId
  AND [timePeriod] = @timePeriod
  AND [timePeriodFrom] = @From)

SELECT lb.rnk,lb.Score, lb.createdDateTime, lb.playerId, Player.playerName
FROM Lb INNER JOIN Player ON lb.playerId = Player.id
ORDER BY rnk OFFSET 75 ROWS FETCH NEXT 25 ROWS ONLY;

-- Get rank of a player for a given game 
SELECT (SELECT COUNT(DISTINCT rnk.score) 
        FROM [Leaderboard] as rnk 
        WHERE rnk.GameId = @GameId 
            AND rnk.[timePeriod] = @timePeriod
            AND rnk.[timePeriodFrom] = @From
            AND rnk.score >= [Leaderboard].score) as rnk
    ,[Leaderboard].Score, [Leaderboard].createdDateTime, [Leaderboard].playerId, Player.playerName
FROM [Leaderboard]  INNER JOIN Player ON [Leaderboard].playerId = Player.id
where [Leaderboard].GameId = @GameId
    AND [Leaderboard].[timePeriod] = @timePeriod
    AND [Leaderboard].[timePeriodFrom] = @From
    and Player.playerName = N'785DDBBB-3000-4730-B'
GO

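This is only an example to illustrate the idea, and it can be optimized. For example, the GameID, TimePeriod and TimePeriodDate columns can be combined into a single column via a dictionary table, which makes the indexes narrower and more effective.

P.S. Sorry for my English. Feel free to fix grammar or spelling errors.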
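
One possible shape for that dictionary-table idea is sketched below. LeaderboardBucket and LeaderboardCompact are hypothetical names, not part of the answer; the point is only that a single int key replaces the three leading columns in every index.

```sql
-- Hypothetical dictionary table: one row per (game, period, period start).
CREATE TABLE LeaderboardBucket (
    id              int      NOT NULL IDENTITY
                             CONSTRAINT PK_LeaderboardBucket PRIMARY KEY CLUSTERED,
    gameId          smallint NOT NULL,
    timePeriod      tinyint  NOT NULL,
    timePeriodFrom  date     NOT NULL,
    CONSTRAINT UQ_LeaderboardBucket UNIQUE (gameId, timePeriod, timePeriodFrom)
);

-- Leaderboard rows reference the bucket instead of repeating three columns.
CREATE TABLE LeaderboardCompact (
    bucketId        int          NOT NULL
                                 CONSTRAINT FK_LeaderboardCompact_Bucket
                                 REFERENCES LeaderboardBucket (id),
    playerId        int          NOT NULL,
    score           int          NOT NULL,
    createdDateTime datetime2(3) NOT NULL,
    CONSTRAINT PK_LeaderboardCompact PRIMARY KEY CLUSTERED (bucketId, playerId)
);

CREATE NONCLUSTERED INDEX IX_LeaderboardCompact_bucketId_score
    ON LeaderboardCompact (bucketId ASC, score DESC);
GO

-- Usage: resolve the bucket once, then page through its ranking.
DECLARE @BucketId int = (SELECT id FROM LeaderboardBucket
                         WHERE gameId = 3 AND timePeriod = 0 AND timePeriodFrom = '19000101');

SELECT DENSE_RANK() OVER (ORDER BY c.score DESC) AS rnk, c.playerId, c.score
FROM LeaderboardCompact AS c
WHERE c.bucketId = @BucketId
ORDER BY rnk OFFSET 0 ROWS FETCH NEXT 25 ROWS ONLY;
```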

Regarding "sql - Leaderboard design using SQL Server", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/19921878/
