
sql - How would I join this statistical data?

Reposted. Author: 行者123. Updated: 2023-11-29 11:50:57

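First, sorry about the question title. I don't know the statistical term for this, or the name for this kind of join problem, whatever it is.

I have a query* that essentially generates three things: random_sex, random_first, and random_last. I'm trying to join its output (below) to census name data using this method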

 random_sex |   random_first   |   random_last
------------+------------------+------------------
 male       | 47.7101715711225 | 24.3833348881337
 male       | 72.8463141907472 | 28.3560050522089
 female     | 72.8617294209544 | 33.3203859277759
 male       | 39.3406164890062 | 26.3352867371729
 female     | 28.6855500966031 | 65.8870893270099
 female     | 35.5960198949557 | 83.1188118207422
 male       | 11.5711074977927 |  10.544433838184
 male       | 15.6900786811765 | 18.7324617852545
 male       | 24.9860797089245 | 8.98265511383023
 female     | 80.4563122882508 |  35.594445341751
(10 rows)

Essentially, the census data sits in a table like this...

    name    | freq  | cumfreq | rank | name_type 
------------+-------+---------+------+-----------
SMITH | 1.006 | 1.006 | 1 | LAST
JOHNSON | 0.81 | 1.816 | 2 | LAST
WILLIAMS | 0.699 | 2.515 | 3 | LAST
JONES | 0.621 | 3.136 | 4 | LAST
BROWN | 0.621 | 3.757 | 5 | LAST
DAVIS | 0.48 | 4.237 | 6 | LAST
MILLER | 0.424 | 4.66 | 7 | LAST
WILSON | 0.339 | 5 | 8 | LAST
MOORE | 0.312 | 5.312 | 9 | LAST
TAYLOR | 0.311 | 5.623 | 10 | LAST
ANDERSON | 0.311 | 5.934 | 11 | LAST
THOMAS | 0.311 | 6.245 | 12 | LAST
JACKSON | 0.31 | 6.554 | 13 | LAST
WHITE | 0.279 | 6.834 | 14 | LAST
HARRIS | 0.275 | 7.109 | 15 | LAST
MARTIN | 0.273 | 7.382 | 16 | LAST
THOMPSON | 0.269 | 7.651 | 17 | LAST
GARCIA | 0.254 | 7.905 | 18 | LAST
MARTINEZ | 0.234 | 8.14 | 19 | LAST

And, in this case..

 random_sex |   random_first   |    random_last    
male | 47.7101715711225 | 24.3833348881337

I want it to join (procedurally) like this:

=# select * from census.names where cumfreq > 47.7101715711225 AND name_type = 'MALE_FIRST' order by cumfreq asc limit 1;
name | freq | cumfreq | rank | name_type
--------+-------+---------+------+------------
SILVER | 0.009 | 47.717 | 1424 | MALE_FIRST

=# select * from census.names where cumfreq > 24.3833348881337 AND name_type = 'LAST' order by cumfreq asc limit 1;
name | freq | cumfreq | rank | name_type
--------+-------+---------+------+-----------
HARPER | 0.054 | 24.408 | 185 | LAST

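So this male's name would be Silver Harper. I've never met anyone with that name in my life, but they do exist.

In the query above I want to return "Silver" and "Harper" instead of the random numbers. How can I make it work like this?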
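The two lookups above amount to inverse-CDF sampling: for a random draw, take the first row whose cumulative frequency is strictly greater than the draw. A minimal Python sketch of that rule follows, using only the first five LAST rows shown in the question. The helper name `pick_name` is hypothetical, and the list is assumed to be sorted by cumfreq ascending.

```python
from bisect import bisect_right

# (name, cumfreq) pairs copied from the first five LAST rows in the question,
# already sorted by cumfreq ascending.
LAST_NAMES = [
    ("SMITH", 1.006),
    ("JOHNSON", 1.816),
    ("WILLIAMS", 2.515),
    ("JONES", 3.136),
    ("BROWN", 3.757),
]


def pick_name(rows, draw):
    """Return the first name whose cumfreq is strictly greater than draw.

    Mirrors `WHERE cumfreq > draw ORDER BY cumfreq ASC LIMIT 1`.
    Returns None when no row qualifies (the draw is past the last cumfreq).
    """
    cumfreqs = [cumfreq for _, cumfreq in rows]
    # bisect_right gives the index of the first element strictly greater than draw.
    i = bisect_right(cumfreqs, draw)
    return rows[i][0] if i < len(rows) else None
```

A name with a larger freq covers a wider slice of the cumfreq range, so a uniform draw lands on it proportionally more often. That is why the lookup reproduces the census frequencies.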


Footnotes

*: For simplicity's sake:

SELECT
CASE WHEN RANDOM() > 0.5 THEN 'male' ELSE 'female' END AS random_sex
, RANDOM() * 90.020 AS random_first -- dataset is 90% of most popular
, RANDOM() * 90.483 AS random_last
FROM generate_series(1,10,1);

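Best answer

Actually, I don't understand statistics either, but I think this is what you want.

Let's call the table that returns the random columns Randoms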

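WITH RANDOMS AS
(
    -- The question's generator: one row of random draws per generated person.
    SELECT
        CASE WHEN RANDOM() > 0.5 THEN 'male' ELSE 'female' END AS random_sex
      , RANDOM() * 90.020 AS random_first
      , RANDOM() * 90.483 AS random_last
    FROM generate_series(1,10,1)
)
SELECT
    -- First name: first row whose cumfreq exceeds the draw.
    -- Note: this always draws from MALE_FIRST, regardless of R.random_sex.
    (
        SELECT A.NAME
        FROM census.names A
        WHERE A.cumfreq > R.random_first
          AND A.name_type = 'MALE_FIRST'
        ORDER BY A.cumfreq ASC LIMIT 1
    ) AS FIRST_NAME,
    -- Last name: same lookup against the LAST rows.
    (
        SELECT A.NAME
        FROM census.names A
        WHERE A.cumfreq > R.random_last
          AND A.name_type = 'LAST'
        ORDER BY A.cumfreq ASC LIMIT 1
    ) AS LAST_NAME
FROM RANDOMS R ;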
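The query above looks up 'MALE_FIRST' for every row, including rows where random_sex is 'female'. Below is a rough sketch of a sex-aware variant, run from Python against an in-memory SQLite stand-in for census.names so it can be tried without a PostgreSQL server. Several things here are assumptions, not facts from the question: that female first names are stored under name_type 'FEMALE_FIRST', the made-up row PLACEHOLDER_F, the second (female) draw, and the helper name `lookup_names`. The SILVER and HARPER rows and the first draw are copied from the question. Fixed draws replace RANDOM() so the result is repeatable.

```python
import sqlite3

QUERY = """
SELECT r.random_sex,
       (SELECT a.name
          FROM census.names a
         WHERE a.cumfreq > r.random_first
           -- ASSUMPTION: types are named MALE_FIRST / FEMALE_FIRST
           AND a.name_type = upper(r.random_sex) || '_FIRST'
         ORDER BY a.cumfreq ASC LIMIT 1) AS first_name,
       (SELECT a.name
          FROM census.names a
         WHERE a.cumfreq > r.random_last
           AND a.name_type = 'LAST'
         ORDER BY a.cumfreq ASC LIMIT 1) AS last_name
  FROM randoms r
 ORDER BY r.rowid
"""


def lookup_names():
    """Run the sex-aware lookup on a tiny in-memory data set."""
    conn = sqlite3.connect(":memory:")
    # Attach a second in-memory database so the table is addressable as census.names.
    conn.execute("ATTACH DATABASE ':memory:' AS census")
    conn.execute(
        "CREATE TABLE census.names (name TEXT, cumfreq REAL, name_type TEXT)"
    )
    conn.execute(
        "CREATE TABLE randoms (random_sex TEXT, random_first REAL, random_last REAL)"
    )
    conn.executemany(
        "INSERT INTO census.names VALUES (?, ?, ?)",
        [
            ("SILVER", 47.717, "MALE_FIRST"),         # from the question
            ("HARPER", 24.408, "LAST"),               # from the question
            ("PLACEHOLDER_F", 90.0, "FEMALE_FIRST"),  # made-up row, assumption
        ],
    )
    conn.executemany(
        "INSERT INTO randoms VALUES (?, ?, ?)",
        [
            ("male", 47.7101715711225, 24.3833348881337),  # from the question
            ("female", 10.0, 24.0),                        # hypothetical draw
        ],
    )
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows
```

The `upper(...) || '_FIRST'` expression is also valid PostgreSQL, so the same predicate could replace the hard-coded 'MALE_FIRST' in the answer's query, provided the name types really follow that naming.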

Regarding "sql - How would I join this statistical data?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/8979852/
