python - For loop with case statements

Reposted by: 行者123 · Updated: 2023-12-05 09:35:33

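I am looking for a way to use a for loop to build my case statements, so that my script is easier to maintain.

The script shown below runs without errors, but it feels too verbose and may cause confusion the next time it needs maintenance.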

from pyspark.sql.functions import col, when

# when() expects a Column condition, so each branch uses col(...).like(...).
# The first matching branch wins, so '%SECZ%' must stay ahead of '%SEC%'.
df = df.withColumn('Product', when(col('input_file_name').like('%CAD%'), 'Cash and Due').
                   when(col('input_file_name').like('%TP%'), 'Trade Product').
                   when(col('input_file_name').like('%LNS%'), 'Corp Loans').
                   when(col('input_file_name').like('%DBT%'), 'Debt').
                   when(col('input_file_name').like('%CRD%'), 'Retail Cards').
                   when(col('input_file_name').like('%MTG%'), 'Mortgage').
                   when(col('input_file_name').like('%OD%'), 'Overdraft').
                   when(col('input_file_name').like('%PLN%'), 'Retail Personal Loan').
                   when(col('input_file_name').like('%CLN%'), 'CLN').
                   when(col('input_file_name').like('%CAT%'), 'Custody and Trust').
                   when(col('input_file_name').like('%DEP%'), 'Deposits').
                   when(col('input_file_name').like('%STZ%'), 'Securitization').
                   when(col('input_file_name').like('%SECZ%'), 'Securities Securitization').
                   when(col('input_file_name').like('%SEC%'), 'Securities').
                   when(col('input_file_name').like('%MTSZ%'), 'Retail Mortgage Securitization').
                   when(col('input_file_name').like('%PLSZ%'), 'Retail Personal Loan Securitization').
                   when(col('input_file_name').like('%CCSZ%'), 'Retail Cards Securitization').
                   when(col('input_file_name').like('%CMN%'), 'Cash Management').
                   when(col('input_file_name').like('%OTC%'), 'Over-the-counter').
                   when(col('input_file_name').like('%SFT%'), 'Securities Financing Transactions').
                   when(col('input_file_name').like('%ETD%'), 'Exchange Traded Derivative').
                   when(col('input_file_name').like('%DEF%'), 'Default Products').
                   when(col('input_file_name').like('%FFS%'), 'Not Required').
                   when(col('input_file_name').like('%hdfs%'), 'Not Required').
                   otherwise(col('feed_name')))  # fall back to the feed_name column value

I thought of running a loop. An example is shown below (the script is not correct; it is for demonstration purposes only):

product_code = ['%CAD%','%TP%','%LNS%','%DBT%','%CRD%','%MTG%','%OD%','%PLN%','%CLN%','%CAT%','%DEP%','%STZ%','%SECZ%','%SEC%','%MTSZ%','%PLSZ%','%CCSZ%','%CMN%','%OTC%','%SFT%','%ETD%','%DEF%','%FFS%','%hdfs%']
product_name = ['Cash and Due','Trade Product','Corp Loans','Debt','Retail Cards','Mortgage','Overdraft','Retail Personal Loan','CLN','Custody and Trust','Deposits','Securitization','Securities Securitization','Securities','Retail Mortgage Securitization','Retail Personal Loan Securitization','Retail Cards Securitization','Cash Management','Over-the-counter','Securities Financing Transactions','Exchange Traded Derivative','Default Products','Not Required','Not Required']

## product_code and product_name have the same number of entries

last_index = len(product_code) - 1
for x in range(len(product_code)):
    # Logic I had in mind (demonstration only, this does not build a working chain):
    # df.withColumn('Product', when(col('input_file_name').like(product_code[x]), product_name[x]))
    if x == last_index:
        pass  # ...and finish the chain with otherwise(col('feed_name'))

I would like some advice on whether Spark allows running a loop to build the when(...).otherwise case statement, or whether there is another method that suits this use case.

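Update

I have implemented the suggested approach. The query matches the right conditions, but I would like to know why it does not return the fallback value (the lit() in the script below) for rows that satisfy no condition, and instead drops those rows.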

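Sample DF:
# inp_file was not defined in the post; these values are taken from the table below
inp_file = ['sdasdasdasd','_CMN_BD','_CMN_BD_WS']
feed_name = ['bob','arshad','jimmy']
product_code = ['%CMN%','%TP%','%LNS%']
product_name = ['Cash and Due','Trade Product']

df = spark.createDataFrame(
     list(zip(inp_file,feed_name)),
     ['input_file_name','feed_name']
)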

+---------------+---------+
|input_file_name|feed_name|
+---------------+---------+
|sdasdasdasd    |bob      |
|_CMN_BD        |arshad   |
|_CMN_BD_WS     |jimmy    |
+---------------+---------+

product_code = ['%CAD_%','%TP%','%LNS%','%DBT%','%CRD%','%MTG%','%_OD_%','%PLN%','%CLN%','%CAT%','%DEP%','%STZ%','%SECZ%','%SEC%','%MTSZ%','%PLSZ%','%CCSZ%','%CMN%','%OTC%','%SFT%','%ETD%','%DEF%','%FFS%','%hdfs%']
product_name = ['Cash and Due','Trade Product','Corp Loans','Debt','Retail Cards','Mortgage','Overdraft','Retail Personal Loan','CLN','Custody and Trust','Deposits','Securitization','Securities Securitization','Securities','Retail Mortgage Securitization','Retail Personal Loan Securitization','Retail Cards Securitization','Cash Management','Over-the-counter','Securities Financing Transactions','Exchange Traded Derivative','Default Products','Not Required','Not Required']

## -- Create a Spark DataFrame from the list of (code, name) tuples
## -- lit() is used to fill the new column

product_ref_df = spark.createDataFrame(
     list(zip(product_code, product_name)),
     ["product_code", "product_name"]
)
    

def tempDF(df,targetField,columnTitle,condition,targetResult,show=False):
    product_ref_df = spark.createDataFrame(
         list(zip(condition,targetResult)),
         ["condition", "target_result"]
    )
    
    df.join(broadcast(product_ref_df), expr(""+targetField+" like condition")) \
    .withColumn(columnTitle, coalesce(col("target_result"), lit("feed_name"))) \
    .drop('condition','target_result') \
    .show()
    
    return df

product_ref_df = tempDF(df,'input_file_name','Product',product_code,product_name)

When the script runs there is no error, and it returns the following result:

+---------------+---------+------------+
|input_file_name|feed_name|     Product|
+---------------+---------+------------+
|        _CMN_BD|   arshad|Cash and Due|
|     _CMN_BD_WS|    jimmy|Cash and Due|
+---------------+---------+------------+

The result should also include the first row, since we are not removing any rows:

+---------------+---------+------------+
|input_file_name|feed_name|     Product|
+---------------+---------+------------+
|    sdasdasdasd|      bob|         bob|
|        _CMN_BD|   arshad|Cash and Due|
|     _CMN_BD_WS|    jimmy|Cash and Due|
+---------------+---------+------------+

Best answer

You can create a new DataFrame from the product name references and join it with the original df to get the product name:

from pyspark.sql.functions import expr, col, broadcast, coalesce

product_ref_df = spark.createDataFrame(
     list(zip(product_code, product_name)),
     ["product_code", "product_name"]
)

df.join(broadcast(product_ref_df), expr("input_file_name like product_code"), "left") \
  .withColumn("Product", coalesce(col("product_name"), col("feed_name"))) \
  .drop("product_code", "product_name") \
  .show()
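
The join type matters here: without "left", the join is an inner join and rows that match no pattern disappear, which is the behaviour described in the question's update. A second point worth checking is that a join emits one row per matching reference row, so a file name that matches two patterns would appear twice, whereas a chained when keeps only the first match. The sketch below is a plain-Python model of these semantics, not Spark code. The helpers like_to_regex and join_products and the file name X_SECZ_01 are hypothetical, and the LIKE emulation ignores escape characters.

```python
import re

def like_to_regex(pattern):
    """Translate a SQL LIKE pattern: % is any run of characters, _ is exactly one."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)

def join_products(rows, pairs, how="inner"):
    """Model a join on a LIKE condition followed by coalesce(product_name, feed_name)."""
    out = []
    for file_name, feed_name in rows:
        labels = [label for pattern, label in pairs
                  if like_to_regex(pattern).fullmatch(file_name)]
        if not labels and how == "left":
            labels = [None]  # a left join keeps the row with a null product_name
        for label in labels:
            # coalesce: use the matched label, otherwise fall back to feed_name
            out.append((file_name, feed_name, feed_name if label is None else label))
    return out

# Sample data from the question's update
rows = [("sdasdasdasd", "bob"), ("_CMN_BD", "arshad"), ("_CMN_BD_WS", "jimmy")]
pairs = [("%CMN%", "Cash and Due"), ("%TP%", "Trade Product")]

inner_rows = join_products(rows, pairs, how="inner")  # unmatched row is dropped
left_rows = join_products(rows, pairs, how="left")    # unmatched row falls back to feed_name

# Hypothetical file name that matches both '%SECZ%' and '%SEC%'
overlapping = [("%SECZ%", "Securities Securitization"), ("%SEC%", "Securities")]
dup_rows = join_products([("X_SECZ_01", "demo")], overlapping, how="left")

print(len(inner_rows), len(left_rows), len(dup_rows))
```

If overlapping patterns such as '%SECZ%' and '%SEC%' can occur in real file names, the chained when approach may be the safer of the two, because its result depends only on the order of the conditions.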

Or use functools.reduce to chain the case/when conditions, as shown below:

import functools

from pyspark.sql.functions import lit, col, when

case_conditions = list(zip(product_code, product_name))

# Each x is a (pattern, name) tuple: match on x[0], return x[1]
product_col = functools.reduce(
    lambda acc, x: acc.when(col("input_file_name").like(x[0]), lit(x[1])),
    case_conditions[1:],
    when(col("input_file_name").like(case_conditions[0][0]), lit(case_conditions[0][1]))
).otherwise(col("feed_name"))

df.withColumn("Product", product_col).show()
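
Since the question asks specifically for a for loop, a third option is to build the CASE expression as a SQL string in a loop and hand it to expr, which the join answer above already imports. This is a minimal sketch under assumptions: build_case_expr is a hypothetical helper, the three sample pairs are a shortened version of the lists in the question, and the helper rejects quotes and backslashes rather than trying to escape them.

```python
def build_case_expr(column, pairs, default_column):
    """Build a SQL CASE expression string; the first matching pattern wins."""
    branches = []
    for pattern, label in pairs:
        for text in (pattern, label):
            # Refuse characters that would need escaping inside a SQL string literal
            if "'" in text or "\\" in text:
                raise ValueError(f"unsupported character in {text!r}")
        branches.append(f"WHEN {column} LIKE '{pattern}' THEN '{label}'")
    # ELSE refers to a column name, not a string literal
    return "CASE " + " ".join(branches) + f" ELSE {default_column} END"

product_code = ["%CAD%", "%TP%", "%LNS%"]
product_name = ["Cash and Due", "Trade Product", "Corp Loans"]

case_sql = build_case_expr("input_file_name", zip(product_code, product_name), "feed_name")
print(case_sql)
```

In Spark the resulting string would then be applied with df.withColumn("Product", expr(case_sql)). As with the chained when, the order of the lists decides which pattern wins, so '%SECZ%' should stay ahead of '%SEC%'.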

This post is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/65786478/
