python - Efficiently generate all possible 4-member team combinations that include specific characters out of 130, and compute certain values

Reposted by: 行者123 · Updated: 2023-12-02 23:10:49

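I have around 130 characters (as in game characters) loaded into a dictionary in memory; each value in the dictionary holds one character's specific data.

How does it work? Each character has 2 chats and 22 reactions.

In a team of 4 members, you go through each character, take their two chats, then go through the reactions of the other three characters, sum the reaction values, and repeat.
Once that is done, take the two highest values (a chat cannot be repeated) and add the two together to get the final value.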

An attempt at pseudocode:

results = []

for character in team
    for chat in chats_of_character
        chat_morale = 0
        for remaining_character in team
            if remaining_character is not character
                grab from remaining_character reactions the value of chat and add it to chat_morale
        add (chat_morale, character, chat) to results as a tuple

sort results list by the first value of each tuple (chat_morale)
create a new list that removes duplicates based on the third item of every tuple in results
grab the two first (which would be the highest chat_morale out of them all) and sum both 
chat_morale and return the result and total_morale

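Or the actual code I am currently using (I left out the parts that sort the results in reverse order by the first value of each tuple, remove tuples that share the same option, and take the two highest tuples by the first value. I will add those parts if needed):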

def _camping(self, heroes):
    results = []

    for hero, data in heroes.items():
        camping_data = data['camping']

        for option in camping_data['options']:
            morale = sum(heroes[hero_2]['camping']['reactions'][option] 
                         for hero_2 in heroes if hero_2 != hero)
            results.append( (morale, hero, option) )

A short example of one character's value:

"camping": {
    "options": [
        "comforting-cheer",
        "myth"
    ],
    "reactions": {
        "advice": 8,
        "belief": 0,
        "bizarre-story": 1,
        "comforting-cheer": 6,
     ...
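
The parts the question leaves out (sorting, dropping repeated chat options, summing the top two) could look roughly like the following. This is only a sketch under stated assumptions: the function name `camping`, the three-hero sample team, and all reaction values are made up for illustration, and the data is assumed to have the same shape as the "camping" sample above.

```python
from operator import itemgetter


def camping(heroes):
    """Return (total_morale, picks) for one team.

    heroes maps hero name -> data dict shaped like the "camping" sample.
    """
    results = []
    for hero, data in heroes.items():
        for option in data["camping"]["options"]:
            # Sum the reactions of the other team members to this chat option.
            morale = sum(
                other["camping"]["reactions"][option]
                for name, other in heroes.items()
                if name != hero
            )
            results.append((morale, hero, option))

    # Highest morale first.
    results.sort(key=itemgetter(0), reverse=True)

    # Keep the best entry per chat option, then stop after two picks.
    picks, seen = [], set()
    for morale, hero, option in results:
        if option not in seen:
            seen.add(option)
            picks.append((morale, hero, option))
        if len(picks) == 2:
            break

    return sum(m for m, _, _ in picks), picks


# Tiny made-up team (values are illustrative, not real game data).
team = {
    "a": {"camping": {"options": ["myth", "advice"],
                      "reactions": {"myth": 0, "advice": 0, "belief": 0}}},
    "b": {"camping": {"options": ["myth", "belief"],
                      "reactions": {"myth": 5, "advice": 2, "belief": 0}}},
    "c": {"camping": {"options": ["advice", "belief"],
                      "reactions": {"myth": 1, "advice": 3, "belief": 3}}},
}
total, picks = camping(team)
```

Returning the picks alongside the total keeps the information about which hero should use which chat, which a bot reply would probably want to show.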

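So what I am trying to build is an efficient, fast system that, given the characters the user enters, retrieves the best remaining X members for the team. If the user enters 2 characters, we return the two best-fitting remaining characters based on the computation over the character-specific values; if the user enters 3 characters, only one member.

In my case efficiency is a must, because I want to give users a quick response from a Discord bot.

So I came up with two different attempts to solve this:

Attempt 1: Compute on the fly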

    all_heroes = self.game_data.get_all_heroes()

    # Generate all possible combinations.
    for combination in itertools.combinations(all_heroes.keys(), r=4):
        # We want only combinations that contains for example the character 'Achates'.
        if set(['achates']).issubset(combination):
            # We grab the character data from the all heroes list to pass it to _camping.
            hero_data = {hero: all_heroes[hero] for hero in combination}
            self._camping(hero_data)

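Generating the combinations alone takes about 6 seconds (roughly 13 million combinations), and depending on the number of fixed characters (in the code sample above it is only 'Achates') the rest takes another 3 to 6 seconds. The total run time usually exceeds 10 seconds, which is a problem because I expect this feature to be used heavily.

The drawback of this approach is that I have to compute every one of them.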
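
One way to avoid enumerating every team and then filtering is to build only the teams that already contain the fixed members: combine the fixed names with every combination of the remaining heroes of size `4 - len(fixed)`. The sketch below is an assumption-laden illustration, not the author's code: `best_teams`, its parameters, and the toy `score` callable are hypothetical, and `score` stands in for the morale calculation.

```python
import heapq
import itertools
import math


def best_teams(all_heroes, fixed, score, team_size=4, top=5):
    """Return the `top` highest-scoring teams that contain every hero in `fixed`.

    `score` is any callable taking a {name: data} dict and returning a number;
    it stands in for the question's morale calculation.
    """
    fixed = tuple(fixed)
    others = [name for name in all_heroes if name not in fixed]
    candidates = (
        fixed + rest
        for rest in itertools.combinations(others, team_size - len(fixed))
    )
    return heapq.nlargest(
        top,
        ((score({n: all_heroes[n] for n in team}), team) for team in candidates),
    )


# With exactly 130 heroes there are C(130, 4) full teams, but far fewer
# candidates once some members are fixed.
full = math.comb(130, 4)       # 11,358,880
one_fixed = math.comb(129, 3)  # 349,504
two_fixed = math.comb(128, 2)  # 8,128

# Toy data: each hero's "data" is just a number and the score is their sum.
toy = {f"h{i}": i for i in range(10)}
best = best_teams(toy, ["h0"], score=lambda team: sum(team.values()), top=1)
```

`heapq.nlargest` keeps only the best `top` entries while streaming through the candidates, so the full list of scored teams never has to be held in memory or sorted.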

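Attempt 2: Precompute every possible team combination with its total morale and store it in a database

This is the closest I have come to solving the problem so far. I generated every possible team combination (about 11-13 million), computed each team's total morale, and stored the team together with its total morale in a database. Computing everything and inserting the data takes over an hour, but that is not a problem because it is a one-time job, and when new characters are added far fewer records need to be inserted.

With indexes, a query that contains only one character takes about 50-60 ms to fetch all teams sorted by total morale, and even less if the team contains 2 or 3 characters and the result is limited to 50 rows.

The problem with this attempt is how the data is stored in the columns, which was a huge oversight on my part. Team order does not affect the total morale, but the stored order is whatever itertools.combinations generated.

In the first query, I tried to find a team that contains both Cidd and Tenebria; it returned Watcher Schuri and Yufine as the two best remaining members, for a total morale of 34. That result is wrong, as the second query shows: there is a team containing both Cidd and Tenebria with a higher morale of 48, but because Tenebria is in the fourth column the first query did not catch it.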
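
The missed team can be reproduced in a few lines. The sketch below assumes each row stores the team in four positional columns in the order itertools.combinations emitted it; the five names are sample values chosen only so that another name sorts between the two wanted ones.

```python
import itertools

# Sample names only, already in sorted order as combinations would see them.
names = ["cidd", "schuri", "tenebria", "yufine", "zeno"]
rows = list(itertools.combinations(names, 4))  # one tuple per row: (c1, c2, c3, c4)

# Positional match, like "c1 = 'cidd' AND c2 = 'tenebria'".
positional = [r for r in rows if r[0] == "cidd" and r[1] == "tenebria"]

# Order-independent match: the row contains both names in any column.
wanted = {"cidd", "tenebria"}
contained = [r for r in rows if wanted.issubset(r)]

print(len(positional), len(contained))
```

The positional filter only sees rows where the two names happen to sit in the first two columns, while the containment check finds every row that includes both, whatever their position.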

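Edit 1: I tried generating every possible condition for the query, but the query was still slow.

Attempt 3: Using @bimsapi's approach

This is something I had already tried earlier today, but I tried it again following his answer step by step. I ended up with this schema: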

                               Table "public.campingcombinations"
    Column    |  Type   | Collation | Nullable |                     Default
--------------+---------+-----------+----------+-------------------------------------------------
 id           | bigint  |           | not null | nextval('campingcombinations_id_seq'::regclass)
 team         | text[]  |           |          |
 total_morale | integer |           |          |
Indexes:
    "idx_team" gin (team)

The table looks like this:

yufinebotdev=# SELECT * FROM CampingCombinations LIMIT 5;
   id   |                  team                  | total_morale
--------+----------------------------------------+--------------
 100001 | {achates,adlay,aither,alexa}           |           26
 100002 | {achates,adlay,aither,angelica}        |           24
 100003 | {achates,adlay,aither,aramintha}       |           25
 100004 | {achates,adlay,aither,arbiter-vildred} |           23
 100005 | {achates,adlay,aither,armin}           |           24

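Sadly, this gave me mixed results. The first query takes about a second, but that depends on the character; the query plan is the same. An example with Achates: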

yufinebotdev=# EXPLAIN ANALYZE SELECT * FROM CampingCombinations WHERE team @> ARRAY['achates'] ORDER BY total_morale DESC LIMIT 50;
                                                                               QUERY PLAN                                                                    
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=188770.50..188776.33 rows=50 width=89) (actual time=1291.841..1302.641 rows=50 loops=1)
   ->  Gather Merge  (cost=188770.50..221774.07 rows=282868 width=89) (actual time=1291.839..1302.633 rows=50 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Sort  (cost=187770.47..188124.06 rows=141434 width=89) (actual time=1183.865..1183.868 rows=34 loops=3)
               Sort Key: total_morale DESC
               Sort Method: top-N heapsort  Memory: 35kB
               Worker 0:  Sort Method: top-N heapsort  Memory: 35kB
               Worker 1:  Sort Method: top-N heapsort  Memory: 35kB
               ->  Parallel Bitmap Heap Scan on campingcombinations  (cost=3146.68..183072.14 rows=141434 width=89) (actual time=119.376..1152.543 rows=119253 loops=3)
                     Recheck Cond: (team @> '{achates}'::text[])
                     Heap Blocks: exact=1860
                     ->  Bitmap Index Scan on idx_team  (cost=0.00..3061.82 rows=339442 width=0) (actual time=213.798..213.798 rows=357760 loops=1)
                           Index Cond: (team @> '{achates}'::text[])
 Planning Time: 11.893 ms
 Execution Time: 1302.707 ms
(16 rows)

A second run has exactly the same query plan and takes about 135 ms. But then I tried the same thing with another character, 'Serila'.

yufinebotdev=# EXPLAIN ANALYZE SELECT * FROM CampingCombinations WHERE team @> ARRAY['serila'] ORDER BY total_morale DESC LIMIT 50;
                                                                               QUERY PLAN                                                                    
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=188066.24..188072.07 rows=50 width=89) (actual time=30684.587..30746.121 rows=50 loops=1)
   ->  Gather Merge  (cost=188066.24..224336.01 rows=310862 width=89) (actual time=30684.585..30746.110 rows=50 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Sort  (cost=187066.22..187454.79 rows=155431 width=89) (actual time=30369.531..30369.535 rows=37 loops=3)
               Sort Key: total_morale DESC
               Sort Method: top-N heapsort  Memory: 36kB
               Worker 0:  Sort Method: top-N heapsort  Memory: 35kB
               Worker 1:  Sort Method: top-N heapsort  Memory: 36kB
               ->  Parallel Bitmap Heap Scan on campingcombinations  (cost=3455.02..181902.91 rows=155431 width=89) (actual time=519.121..30273.208 rows=119253 loops=3)
                     Recheck Cond: (team @> '{serila}'::text[])
                     Heap Blocks: exact=47394
                     ->  Bitmap Index Scan on idx_team  (cost=0.00..3361.76 rows=373035 width=0) (actual time=771.046..771.046 rows=357760 loops=1)
                           Index Cond: (team @> '{serila}'::text[])
 Planning Time: 7.315 ms
 Execution Time: 30746.199 ms
(16 rows)

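30 seconds... I thought the following queries might be faster, but no: each query takes around 28 to 30 seconds. Although I could not test it thoroughly, it looks like the "farther" the character is, the slower the query.

For example, for characters whose names start with 'A' or 'B', the first query takes 1 second and subsequent queries take 90-100 ms. But with an S character such as Serila, each query shoots up to 15 seconds; a character starting with T takes about 18 seconds per query; and for one starting with M the first query takes 7 seconds and subsequent queries about 900 ms to 1 second.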
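
One possible reading of this pattern, offered as a guess rather than a confirmed diagnosis: itertools.combinations emits teams in lexicographic order, so if rows were inserted in that order, every team containing the alphabetically first name sits in one contiguous run at the start of the table, while teams containing a late name are scattered across the whole table. That would be consistent with the plans above, which report Heap Blocks: exact=1860 for 'achates' against exact=47394 for 'serila' for the same number of matching rows. The sketch below simulates the effect with 30 hypothetical names and a made-up page size of 100 rows.

```python
import itertools

names = [f"hero{i:02d}" for i in range(30)]  # hypothetical, already sorted
rows = list(itertools.combinations(names, 4))  # assumed insertion order of the table
PAGE = 100  # pretend 100 rows fit on one disk page (made-up figure)


def pages_touched(member):
    """Count distinct pages holding at least one team that contains `member`."""
    return len({i // PAGE for i, team in enumerate(rows) if member in team})


first = pages_touched(names[0])   # teams with the first name are contiguous
last = pages_touched(names[-1])   # teams with the last name are spread out
print(first, last)
```

Both names appear in the same number of teams, yet the late name forces reads from far more pages. If this is what is happening, physically reordering the table or keeping the working set in memory would be the things to investigate.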

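Attempt 4: Same as above, but with a varchar[] column

I used COPY instead of one INSERT per value, which greatly reduces the time it takes to add the values to the table. I am not sure whether this affects anything, but I mention it anyway. Another thing worth mentioning is that I switched to a server running 1 vCPU, a 25 GB SSD, and 1 GB of RAM.

The current schema looks like this: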

                                     Table "public.campingcombinations"
    Column    |        Type         | Collation | Nullable |                     Default
--------------+---------------------+-----------+----------+-------------------------------------------------
 id           | bigint              |           | not null | nextval('campingcombinations_id_seq'::regclass)
 team         | character varying[] |           |          |
 total_morale | integer             |           |          |
Indexes:
    "idx_camping_team" gin (team)
    "idx_camping_team_total_morale" btree (total_morale DESC)

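This again produced mixed results. Some single-character queries take about 10 ms at most the first time they run, while others take almost 2 seconds the first time; subsequent queries take either about 10 ms or about 2 seconds, depending on the character.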

EXPLAIN ANALYZE SELECT * FROM CampingCombinations WHERE team @> ARRAY['yufine']::varchar[] ORDER BY total_morale DESC LIMIT 5;

    QUERY PLAN                                                                      
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.44..17.03 rows=5 width=89) (actual time=2.155..2.245 rows=5 loops=1)
   ->  Index Scan using idx_camping_team_total_morale on campingcombinations  (cost=0.44..2142495.49 rows=645575 width=89) (actual time=2.153..2.242 rows=5 loops=1)
         Filter: (team @> '{yufine}'::character varying[])
         Rows Removed by Filter: 2468
 Planning time: 2.241 ms
 Execution time: 2.274 ms
(6 rows)

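This is one of the cases that stays consistent between queries. However, the following one takes several seconds no matter how many times I run the query.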

EXPLAIN ANALYZE SELECT * FROM CampingCombinations WHERE team @> ARRAY['tieria']::varchar[] ORDER BY total_morale DESC LIMIT 5;

    QUERY PLAN                                                              
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.44..17.21 rows=5 width=89) (actual time=4396.876..8626.916 rows=5 loops=1)
   ->  Index Scan using idx_camping_team_total_morale on campingcombinations  (cost=0.44..2142495.49 rows=638566 width=89) (actual time=4396.875..8626.906 rows=5 loops=1)
         Filter: (team @> '{tieria}'::character varying[])
         Rows Removed by Filter: 129428
 Planning time: 0.160 ms
 Execution time: 8626.951 ms
(6 rows)

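A second run gives similar results: planning time 3.879 ms, execution time 6945.253 ms, no matter how many times I run it. For some reason this seems specific to that character; I have not found it with other characters yet. The same thing happens if I query a team of 2 that includes that character.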

EXPLAIN ANALYZE SELECT * FROM CampingCombinations WHERE team @> ARRAY['yufine', 'tieria']::varchar[] ORDER BY total_morale DESC LIMIT 5;

    QUERY PLAN                                                   
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.43..874.29 rows=5 width=89) (actual time=24752.449..39808.550 rows=5 loops=1)
   ->  Index Scan using idx_camping_team_total_morale on campingcombinations  (cost=0.43..1937501.21 rows=11086 width=89) (actual time=24752.444..39808.535 rows=5 loops=1)
         Filter: (team @> '{yufine,tieria}'::character varying[])
         Rows Removed by Filter: 439703
 Planning time: 0.215 ms
 Execution time: 39809.799 ms
(6 rows)

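Subsequent runs for the team of two take more or less the same time. A team of 3 that includes that character, however, seems to work fine: 50-60 ms.

I also found a team of 2 that takes almost 1 minute no matter how many times I query it, even though querying each of the two characters individually causes no problems at all.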

EXPLAIN ANALYZE SELECT * FROM CampingCombinations WHERE team @> ARRAY['purrgis', 'angelica']::varchar[] ORDER BY total_morale DESC LIMIT 5;
                                                                                 QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.43..821.41 rows=5 width=89) (actual time=33491.860..51059.420 rows=5 loops=1)
   ->  Index Scan using idx_camping_team_total_morale on campingcombinations  (cost=0.43..1937501.21 rows=11800 width=89) (actual time=33491.857..51059.409 rows=5 loops=1)
         Filter: (team @> '{purrgis,angelica}'::character varying[])
         Rows Removed by Filter: 595184
 Planning time: 0.139 ms
 Execution time: 51060.318 ms

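Yet each of the two characters on its own takes about 2 ms.

My question is: is there any way to make the second attempt work with good performance while still returning correct results? Or, if that is not possible, is there a better alternative for this feature?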

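Best answer

Precomputation is a good optimization. To handle the column layout better, I would suggest using a PostgreSQL array column to store the team members.

- You can store any number of names in a single column.
- The "contains" operator @> is order-independent, i.e. you get the same result whether the input is ['bar', 'foo'] or ['foo', 'bar'].
- You can index the column to speed up searches, but the index must be of type gin.
- You can extend to other team sizes without significant schema changes.

In your SQL/DDL: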

-- simplified table definition:
create table campingcombinations (
    id bigserial,
    members text[],
    morale int
);

create index idx_members on campingcombinations using gin (members);

In your Python:

# on insert (stmt is assumed to be a psycopg2 cursor)
for team in itertools.combinations(source_list, r=4):
    team = [normalize(name) for name in team]  # lower(), strip(), whatever
    morale = some_function()  # sum, scale, whatever
    stmt.execute('insert into campingcombinations (members, morale) values (%s, %s)', (team, morale))

# on select (team must be a list so that psycopg2 adapts it to an array)
stmt.execute('select * from campingcombinations where members @> %s order by morale desc', (team,))
for row in stmt.fetchall():
    ...  # do something with each row

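In most cases the psycopg2 driver handles the type conversion, but there is one gotcha: depending on how the array is defined, a cast may be needed. For example, I had defined the column as members varchar[], so the "contains" clause needed a cast, e.g. where members @> %s::varchar[]. By default, the input array is treated as text[]. If you define the column as text[], there should be no problem.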
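
To keep the cast and the name normalization in one place, the query could be assembled by a small helper. This is a hedged sketch: `build_team_query` and its parameters are hypothetical, the table and column names follow the simplified DDL above, and the helper only builds the statement; it does not talk to a database.

```python
def build_team_query(members, limit=50, column_type="text"):
    """Build (sql, params) for an order-independent team lookup.

    `column_type` must match how the array column was declared
    ("text" or "varchar"), so the right cast is applied to the parameter.
    """
    if column_type not in ("text", "varchar"):
        raise ValueError("column_type must be 'text' or 'varchar'")
    # A list (not a tuple) so that psycopg2 adapts it to a PostgreSQL array.
    normalized = sorted({m.strip().lower() for m in members})
    sql = (
        "select members, morale from campingcombinations "
        f"where members @> %s::{column_type}[] "
        "order by morale desc limit %s"
    )
    return sql, (normalized, limit)


sql, params = build_team_query([" Yufine", "tieria"], limit=5, column_type="varchar")
```

The returned pair can be passed straight to a psycopg2 cursor, as in `stmt.execute(sql, params)`. Only the column type is interpolated into the SQL text, and it is restricted to two known values; the member names always travel as bound parameters.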

Regarding this topic, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/55555908/
