java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries (Spark in Eclipse on Windows 7)

Asked by ElizabethClark in Java on Apr 12, 2021

I'm not able to run a simple Spark job in Scala IDE (a Maven Spark project) installed on Windows 7.

The spark-core dependency has been added.

import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf().setAppName("DemoDF").setMaster("local")
val sc = new SparkContext(conf)
val logData = sc.textFile("File.txt")
logData.count()

Error:

16/02/26 18:29:33 INFO SparkContext: Created broadcast 0 from textFile at FrameDemo.scala:13
16/02/26 18:29:34 ERROR Shell: Failed to locate the winutils binary in the hadoop binary path
java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
    at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:278)
    at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:300)
    at org.apache.hadoop.util.Shell.<clinit>(Shell.java:293)
    at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:76)
    at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:362)
    at org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$33.apply(SparkContext.scala:1015)
    at org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$33.apply(SparkContext.scala:1015)
    at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$6.apply(HadoopRDD.scala:176)
    at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$6.apply(HadoopRDD.scala:176)
    at scala.Option.map(Option.scala:145)
    at org.apache.spark.rdd.HadoopRDD.getJobConf(HadoopRDD.scala:176)
    at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:195)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1929)
    at org.apache.spark.rdd.RDD.count(RDD.scala:1143)
    at com.org.SparkDF.FrameDemo$.main(FrameDemo.scala:14)
    at com.org.SparkDF.FrameDemo.main(FrameDemo.scala)

Answered by Fujiwara Ban
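
Hadoop's Shell class builds the winutils path as %HADOOP_HOME%\bin\winutils.exe. Because HADOOP_HOME is not set in your environment, the prefix resolves to null, which is exactly what the error message shows. The usual fix is to download winutils.exe for your Hadoop version, place it in a bin folder, and point Hadoop at the folder above bin before the SparkContext is created. A minimal sketch follows; the C:\hadoop location is just an example, so substitute wherever you actually placed bin\winutils.exe:

import org.apache.spark.{SparkConf, SparkContext}

object FrameDemo {
  def main(args: Array[String]): Unit = {
    // Assumes winutils.exe was downloaded to C:\hadoop\bin\winutils.exe.
    // Hadoop's Shell class checks the "hadoop.home.dir" system property
    // before the HADOOP_HOME environment variable, so setting it here,
    // before any Spark/Hadoop class initializes, is enough.
    System.setProperty("hadoop.home.dir", "C:\\hadoop")

    val conf = new SparkConf().setAppName("DemoDF").setMaster("local")
    val sc = new SparkContext(conf)
    val logData = sc.textFile("File.txt")
    println(logData.count())
    sc.stop()
  }
}

Alternatively, set HADOOP_HOME=C:\hadoop as a Windows environment variable and restart Eclipse so it picks the variable up; both routes feed the same Shell.getQualifiedBinPath lookup that is failing in your trace.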
