How to use Column.isin with a List in a Spark DataFrame (checking whether a column's value is in a List): filter(Column.isin(List))

Published: 2023-11-03 04:25:55

Filtering a Spark DataFrame on the values of one column:

val items = List("a", "b", "c")
sqlContext.sql("select c1 from table").filter($"c1".isin(items)).collect.foreach(println)

 

Passing the List directly raises an error:

Exception in thread "main" java.lang.RuntimeException: Unsupported literal type class scala.collection.immutable.$colon$colon List(a, b, c) 
at org.apache.spark.sql.catalyst.expressions.Literal$.apply(literals.scala:49)
at org.apache.spark.sql.functions$.lit(functions.scala:89)
at org.apache.spark.sql.Column$$anonfun$isin$1.apply(Column.scala:642)
at org.apache.spark.sql.Column$$anonfun$isin$1.apply(Column.scala:642)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:35)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:245)
at scala.collection.AbstractTraversable.map(Traversable.scala:104)
at org.apache.spark.sql.Column.isin(Column.scala:642)

 

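According to the documentation, isin takes varargs, not a List: its signature is isin(list: Any*), so the parameter name "list" is misleading. A List passed directly is treated as one single value, and Spark fails when it tries to convert the whole List into a literal (the Literal$.apply frame in the stack trace above). Expand the List into varargs with ": _*", as follows: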

 

val items = List("a", "b", "c")
sqlContext.sql("select c1 from table").filter($"c1".isin(items: _*)).collect.foreach(println)
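
The difference between the two calls can be reproduced without Spark. The following is a minimal sketch, not Spark code: countArgs is a hypothetical stand-in for any method declared with a varargs parameter like isin(list: Any*), and it only reports how many arguments it received.

```scala
object VarargsDemo {
  // Hypothetical stand-in for a varargs method such as Column.isin(list: Any*).
  // Returns the number of separate arguments the method actually received.
  def countArgs(list: Any*): Int = list.length

  def main(args: Array[String]): Unit = {
    val items = List("a", "b", "c")

    // Passing the List directly: the whole List is a single argument of type Any.
    val direct = countArgs(items)

    // Expanding with `: _*`: each element becomes its own argument.
    val expanded = countArgs(items: _*)

    println(s"direct = $direct, expanded = $expanded") // prints "direct = 1, expanded = 3"
  }
}
```

In the direct call the method sees one argument, the List itself, which is why Spark ends up trying to build a literal from List(a, b, c). With ": _*" it sees three separate String arguments.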

 
