Problem description
Hi everyone, I am trying to interpret this Power BI syntax and convert it to PySpark:
if(UCS_Incidents[Intensity]="Very High",IF(UCS_Incidents[Severity]="Very High","Red",IF(UCS_Incidents[Severity]="High",IF(UCS_Incidents[Severity]="Medium","Orange","Yellow"))),if(UCS_Incidents[Intensity]="High",if(UCS_Incidents[Intensity]="Medium","Yellow","Green"))),if(UCS_Incidents[Intensity]="Low","Green",""))))
This is what I tried:
Intensities = df.withColumn(('Intensities',f.when((f.col('Intensity') == 'Very High') & (f.col('Severity') == 'Very High'),"Red").
otherwise(f.when((f.col('Intensity') == 'Very High') & (f.col('Severity') == 'High'),"Red").
otherwise(f.when((f.col('Intensity') == 'Very High') & (f.col('Severity') == 'Medium'),"Orange")
.otherwise('Yellow'))))
.otherwise(f.when((f.col('Intensity') == 'High') & (f.col('Severity') == 'Very High'),"Red").
otherwise(f.when((f.col('Intensity') == 'High') & (f.col('Severity') == 'High'),"Orange").
otherwise(f.when((f.col('Intensity') == 'High') & (f.col('Severity') == 'Medium'),"Orange")
.otherwise('Yellow'))))
.otherwise(f.when((f.col('Intensity') == 'Medium') & (f.col('Severity') == 'Very High'),"Orange").
otherwise(f.when((f.col('Intensity') == 'Medium') & (f.col('Severity') == 'High'),"Yellow").
otherwise(f.when((f.col('Intensity') == 'Medium') & (f.col('Severity') == 'Medium'),"Yellow")
.otherwise('Green'))))
.otherwise(f.when((f.col('Intensity') == 'Low') & (f.col('Severity') == 'Very High'),"Yellow").
otherwise(f.when((f.col('Intensity') == 'Low') & (f.col('Severity') == 'High'),"Green").
otherwise(f.when((f.col('Intensity') == 'Low') & (f.col('Severity') == 'Medium'),"Green")
.otherwise('Green'))))
).otherwise("")
However, I get this error:
A Tuple Object doesn't have an attribute Otherwise
Any help would be greatly appreciated, thanks.
Solution
Just to illustrate what @jxc meant: assuming you already have a DataFrame named df:
from pyspark.sql.functions import expr
Intensities = df.withColumn('Intensities',expr("CASE WHEN Intensity = 'Very High' AND Severity = 'Very High' THEN 'Red' WHEN .... ELSE ... END"))
I left "..." as placeholders, but I think it makes the approach clear.
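To avoid typing every branch by hand, the CASE string passed to `expr` could also be generated from a lookup table. The following is only a sketch: the `RULES` mapping and the helper `build_case_expr` are hypothetical names, and the colors are an assumption read off the attempt in the question rather than confirmed business rules, so adjust them as needed.

```python
# Hypothetical lookup table (an assumption based on the asker's attempt):
# intensity -> ({severity: color}, fallback color for any other severity)
RULES = {
    "Very High": ({"Very High": "Red", "High": "Red", "Medium": "Orange"}, "Yellow"),
    "High": ({"Very High": "Red", "High": "Orange", "Medium": "Orange"}, "Yellow"),
    "Medium": ({"Very High": "Orange", "High": "Yellow", "Medium": "Yellow"}, "Green"),
    "Low": ({"Very High": "Yellow", "High": "Green", "Medium": "Green"}, "Green"),
}


def build_case_expr(rules, default=""):
    """Build a SQL CASE expression string from the lookup table."""
    branches = []
    for intensity, (by_severity, fallback) in rules.items():
        for severity, color in by_severity.items():
            branches.append(
                f"WHEN Intensity = '{intensity}' AND Severity = '{severity}' "
                f"THEN '{color}'"
            )
        # Any other severity for this intensity falls back to a single color.
        branches.append(f"WHEN Intensity = '{intensity}' THEN '{fallback}'")
    return "CASE " + " ".join(branches) + f" ELSE '{default}' END"


case_sql = build_case_expr(RULES)
print(case_sql)

# With PySpark available, the string plugs into the same call as in the answer:
# from pyspark.sql.functions import expr
# Intensities = df.withColumn('Intensities', expr(case_sql))
```

As for the error in the question: the doubled parenthesis in `df.withColumn((` wraps the column name and the first `when` chain in a Python tuple, so the next `.otherwise` is called on that tuple instead of on a Column.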