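Problem description
I have been trying to run some experiments on Datasets in a locally running Zeppelin 0.9. However, I get an NPE (NullPointerException) when I perform operations on a Dataset. The same operations seem to work on a DataFrame. Here is a failing example: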
import spark.implicits._
import org.apache.spark.sql.Row
import org.apache.spark.sql.types._
case class Person(firstname: String,middlename: String,lastname: String,id: String,gender: String,salary: Int)
val simpleData = Seq(
  Row("James","","Smith","36636","M",3000),
  Row("Michael","Rose","","40288","M",4000),
  Row("Robert","","Williams","42114","M",4000),
  Row("Maria","Anne","Jones","39192","F",4000),
  Row("Jen","Mary","brown","","F",-1)
)
val simpleSchema = StructType(Array(
  StructField("firstname",StringType,true),
  StructField("middlename",StringType,true),
  StructField("lastname",StringType,true),
  StructField("id",StringType,true),
  StructField("gender",StringType,true),
  StructField("salary",IntegerType,true)
))
val df = spark.createDataFrame(
spark.sparkContext.parallelize(simpleData),simpleSchema).as[Person]
df.filter( x => x.firstname == "James").show()
This is the error I get:
java.lang.NullPointerException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.sql.catalyst.encoders.OuterScopes$$anonfun$getOuterScope$1.apply(OuterScopes.scala:70)
at org.apache.spark.sql.catalyst.expressions.objects.NewInstance$$anonfun$10.apply(objects.scala:485)
at org.apache.spark.sql.catalyst.expressions.objects.NewInstance$$anonfun$10.apply(objects.scala:485)
at scala.Option.map(Option.scala:146)
at org.apache.spark.sql.catalyst.expressions.objects.NewInstance.doGenCode(objects.scala:485)
at org.apache.spark.sql.catalyst.expressions.Expression$$anonfun$genCode$2.apply(Expression.scala:108)
at org.apache.spark.sql.catalyst.expressions.Expression$$anonfun$genCode$2.apply(Expression.scala:105)
at scala.Option.getOrElse(Option.scala:121)
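A possible workaround, offered as a sketch and not as a confirmed fix: the trace shows the NPE being raised in OuterScopes.getOuterScope, called from NewInstance.doGenCode, which is the point where Spark generates code to construct Person objects. That suggests Person is being treated as an inner class of the notebook's REPL wrapper and that its outer instance could not be resolved. Under that assumption, two things may be worth trying: define the case class in its own Zeppelin paragraph and run it before the paragraph that uses it, or filter with a Column expression so that no Person object has to be constructed for the filter. The name `people` and the shortened sample data below are illustrative only.

```scala
// ---- Zeppelin paragraph 1: only the case class, run before anything else ----
case class Person(firstname: String, middlename: String, lastname: String,
                  id: String, gender: String, salary: Int)

// ---- Zeppelin paragraph 2: build and query the Dataset ----
// `spark` is assumed to be the SparkSession provided by the Zeppelin Spark interpreter.
import spark.implicits._

// Building the Dataset directly from Person instances avoids the Row + schema + as[Person] detour.
val people = Seq(
  Person("James", "", "Smith", "36636", "M", 3000),
  Person("Maria", "Anne", "Jones", "39192", "F", 4000)
).toDS()

// Typed filter: Spark has to construct Person objects to evaluate the lambda.
// This is the code path that appears in the stack trace above.
people.filter(p => p.firstname == "James").show()

// Column-based filter: evaluated on the column values, without building Person objects.
people.filter($"firstname" === "James").show()
```

If the typed filter still fails after the case class has been moved to its own paragraph, the Column-based form matches the observation that the same operation works on a DataFrame.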