java - Spark SQL: NOT EXISTS or NOT IN

Reposted. Author: 太空宇宙. Updated: 2023-11-04 10:58:36

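I have a scenario with two tables (CSV files), and I created a table for each of them. When the data is good, I can map it to the values in the second table (the id_statistic values). When the data is bad, I should again map it to an id_statistic value, but a different one. However, I cannot use "NOT EXISTS" in Spark SQL. I get the following error: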

mismatched input 'from' expecting {<EOF>, 'WHERE', 'GROUP', 'ORDER', 'HAVING', 'LIMIT', 'LATERAL', 'WINDOW', 'UNION', 'EXCEPT', 'INTERSECT', 'SORT', 'CLUSTER', 'DISTRIBUTE'}(line 1, pos 386)

at org.apache.spark.sql.catalyst.parser.ParseException.withCommand(ParseDriver.scala:197)
at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parse(ParseDriver.scala:99)
at org.apache.spark.sql.execution.SparkSqlParser.parse(SparkSqlParser.scala:45)

Code:

select
a.ptf_id,a.ptf_code,a.share_id,a.share_code,a.bench_id,a.bench_code
, a.l1_calculation_date,a.l1_begin_date,a.l1_end_date,a.l1_running_date
, a.l1_frequency,a.l1_calculation_step,a.l1_performance_currency
, a.l1_configuration,a.l1_valuation_source,a.l1_nav_valuation_type
, a.l1_setting_reference_type, a.l1_setting_valuation_type
, a.l1_sharpe_ratio_annualized as value,b.id_statistic
from
parquetFile a,
pairRDD b,
stats c
where
a.l1_nav_valuation_type= b.l1_nav_valuation_type
and a.l1_valuation_source = b.l1_valuation_source
and b.l1_Perf = 'l1_sharpe_ratio_annualized'
OR (a.ptf_id not EXISTS (
select e.ptf_id from pairRDD d, parquetFile e
where d.l1_valuation_source = e.l1_valuation_source
AND d.l1_nav_valuation_type = e.l1_nav_valuation_type)
and b.l1_valuation_source ='')

This query works in SQL if I use "NOT IN". Please help me understand what options other than NOT EXISTS I can use in this case.

Best answer

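The query as written is a bit hard to follow. It is not clear (to me, anyway) what you intend with the stats table, since you do not select anything from it, so I removed it. Obviously I have not tried this myself, but at first glance you might try something like this: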

select
a.PTF_ID,
a.PTF_CODE,
a.SHARE_ID,
a.SHARE_CODE,
a.BENCH_ID,
a.BENCH_CODE,
a.L1_CALCULATION_DATE,
a.L1_BEGIN_DATE,
a.L1_END_DATE,
a.L1_RUNNING_DATE,
a.L1_FREQUENCY,
a.L1_CALCULATION_STEP,
a.L1_PERFORMANCE_CURRENCY,
a.L1_CONFIGURATION,
a.L1_VALUATION_SOURCE,
a.L1_NAV_VALUATION_TYPE,
a.L1_SETTING_REFERENCE_TYPE,
a.L1_SETTING_VALUATION_TYPE,
a.L1_SHARPE_RATIO_ANNUALIZED as VALUE,
b.ID_STATISTIC
from
PARQUETFILE a
inner join
PAIRRDD b
on
a.L1_NAV_VALUATION_TYPE = b.L1_NAV_VALUATION_TYPE and
a.L1_VALUATION_SOURCE = b.L1_VALUATION_SOURCE
left outer join (
select
e.PTF_ID
from
PAIRRDD d
inner join
PARQUETFILE e
on
d.L1_VALUATION_SOURCE = e.L1_VALUATION_SOURCE AND
d.L1_NAV_VALUATION_TYPE = e.L1_NAV_VALUATION_TYPE
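) x
on
a.PTF_ID = x.PTF_ID
where
b.L1_PERF = 'l1_sharpe_ratio_annualized' or
-- alias b is not visible inside derived table x, so the empty-source filter is applied here,
-- paired with the "no match" test as in the original query's parentheses
(x.PTF_ID is null and b.L1_VALUATION_SOURCE = '')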
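
For comparison, here is a minimal sketch of two other ways to express "no matching row" in Spark SQL, reusing the table and column names from the question. It only covers the "bad data" side (rows of parquetFile whose ptf_id never appears in the matched set) and is an untested illustration, not a drop-in replacement for the full query. The first form uses NOT EXISTS as a standalone predicate: EXISTS takes no left-hand operand, so the link to the outer row (e.ptf_id = a.ptf_id) goes inside the subquery rather than in front of it. The second form uses LEFT ANTI JOIN, which keeps only the left-side rows that have no match on the right.

```sql
-- Form 1: NOT EXISTS with the correlation written inside the subquery.
-- Run each statement in its own spark.sql(...) call.
select
  a.ptf_id,
  a.ptf_code
from parquetFile a
where not exists (
  select 1
  from pairRDD d
  inner join parquetFile e
    on d.l1_valuation_source = e.l1_valuation_source
   and d.l1_nav_valuation_type = e.l1_nav_valuation_type
  where e.ptf_id = a.ptf_id          -- ties the subquery to the outer row
)

-- Form 2: LEFT ANTI JOIN returns rows of a that have no partner in x.
select
  a.ptf_id,
  a.ptf_code
from parquetFile a
left anti join (
  select e.ptf_id
  from pairRDD d
  inner join parquetFile e
    on d.l1_valuation_source = e.l1_valuation_source
   and d.l1_nav_valuation_type = e.l1_nav_valuation_type
) x
  on a.ptf_id = x.ptf_id
```

One difference from NOT IN is worth keeping in mind: if the subquery can return a NULL ptf_id, NOT IN yields no rows at all, whereas NOT EXISTS and the anti join simply treat NULL as a non-match.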

Regarding java - Spark SQL: NOT EXISTS or NOT IN, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/47136075/
