sql - Concurrency of SELECT queries in Redshift

Tags: sql concurrency amazon-redshift

We have a table in Redshift:

people_id    people_tele      people_email        role
1            8989898332       john@gmail.com      manager
2            8989898333       steve@gmail.com     manager
3            8989898334       andrew@gmail.com    manager
4            8989898335       george@gmail.com    manager

Several users query this table, for example:

select * from people where role = 'manager' limit 1;

The users of the system call these people to sell a product, so the query must not return the same person to two different users.

For example:

If user A runs the query select * from people where role = 'manager' limit 1; he should get:

people_id    people_tele      people_email        role
1            8989898332       john@gmail.com      manager

If user B runs the query select * from people where role = 'manager' limit 1; he should get:

people_id    people_tele      people_email        role
2            8989898333       steve@gmail.com     manager

Approach 1

I thought of adding an is_processed column so that the same row is not returned twice. After user A runs the query, the table would look like this:

people_id    people_tele      people_email        role         is_processed
1            8989898332       john@gmail.com      manager      1
2            8989898333       steve@gmail.com     manager      0
3            8989898334       andrew@gmail.com    manager      0
4            8989898335       george@gmail.com    manager      0
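
A minimal sketch of Approach 1, assuming is_processed is a SMALLINT that defaults to 0 (the column type and default are assumptions, not stated in the question). When the select and the update run as two separate statements like this, there is a window between them in which another user can read the same row.

```sql
-- Assumed column definition: 0 = not yet handed out, 1 = already handed out.
ALTER TABLE people ADD COLUMN is_processed SMALLINT DEFAULT 0;

-- User A picks the next unprocessed manager ...
SELECT *
FROM people
WHERE role = 'manager'
  AND is_processed = 0
LIMIT 1;

-- ... and then marks the returned row (people_id = 1 in the example) as taken.
UPDATE people
SET is_processed = 1
WHERE people_id = 1;
```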

Approach 2

Another idea is to create a second table, query_history, containing:

query_id   people_id     processed_time
1          1             22 Jan 2020, 4pm
2          2             22 Jan 2020, 5pm
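
Approach 2 could be sketched as follows. The column types are assumptions (the question does not give them), and a person is treated as available when no query_history row references them.

```sql
-- Assumed schema for the history table described above.
CREATE TABLE query_history (
    query_id       INT IDENTITY(1, 1),
    people_id      INT NOT NULL,
    processed_time TIMESTAMP
);

-- Next manager who has not been handed out yet.
SELECT p.*
FROM people p
LEFT JOIN query_history h ON h.people_id = p.people_id
WHERE p.role = 'manager'
  AND h.people_id IS NULL
LIMIT 1;

-- Record the hand-out (people_id = 1 in the example).
INSERT INTO query_history (people_id, processed_time)
VALUES (1, GETDATE());
```

As with the is_processed column, the select and the insert are separate statements here, so two sessions can still read the same person before either one records it.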

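Question

What happens when user A and user B run the query at the same time? The system would return the same people_id to both, and the same person would be called twice.

How can I solve this concurrency problem?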

Best answer

You can address this by adding random ordering to Approach 1:

SELECT * FROM people
WHERE role = 'manager'
AND is_processed = 0
ORDER BY random()
LIMIT 1;

Reference: https://docs.aws.amazon.com/redshift/latest/dg/r_RANDOM.html
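
Random ordering makes it less likely that two concurrent sessions pick the same row, but it does not rule it out, because each session still reads before either one writes. One way to close that gap is to take an explicit table lock so that the select and the update run as one unit. This sketch assumes every caller goes through the same transaction; the temporary table name claimed is hypothetical.

```sql
BEGIN;

-- Reads and writes on people from other transactions wait
-- until this transaction ends.
LOCK people;

-- Pick one unprocessed manager; "claimed" is a hypothetical temp table name.
CREATE TEMP TABLE claimed AS
SELECT people_id
FROM people
WHERE role = 'manager'
  AND is_processed = 0
ORDER BY random()
LIMIT 1;

-- Mark the row as taken before the lock is released.
UPDATE people
SET is_processed = 1
WHERE people_id IN (SELECT people_id FROM claimed);

-- Return the claimed row to the caller.
SELECT p.*
FROM people p
JOIN claimed c ON c.people_id = p.people_id;

COMMIT;
```

The temporary table lives until the session ends, so it has to be dropped (DROP TABLE claimed;) before the same session claims another row. Locking the whole table serializes all callers, which is a deliberate trade of throughput for correctness in this sketch.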

For sql - Concurrency of SELECT queries in Redshift, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/59826303/
