Error running a Python script in a Hadoop Streaming application
I am running the Hadoop sample application given in 'Hadoop in Action' by Chuck Lam on a Windows 7 notebook in a Cygwin environment. Python is installed on Cygwin, and a sample Python application runs. However, when I run the Hadoop Streaming application, it throws the error shown further down. I am using the following command:
bin/hadoop jar contrib/streaming/hadoop-streaming-1.2.1.jar -D mapred.reduce.tasks=1 -input input/cite75_99.txt -output output -mapper 'randomsample.py 10' -file randomsample.py
randomsample.py is a simple application that filters the input.
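For reference, a filter of this kind can be sketched roughly as below. This is a minimal sketch, not the exact script from the book: the function name `sample_lines` is hypothetical, and it assumes the single argument (`10` in the command above) is a sampling percentage.

```python
#!/usr/bin/env python
"""Hypothetical sketch of a sampling mapper for Hadoop Streaming."""
import random
import sys


def sample_lines(lines, percent):
    """Yield each input line with a probability of percent/100."""
    for line in lines:
        # randint(1, 100) is uniform over 1..100, so the test passes
        # for roughly `percent` out of every 100 lines.
        if random.randint(1, 100) <= percent:
            yield line.rstrip("\n")


if __name__ == "__main__":
    # Assumption: the first argument is the sampling percentage.
    # Anything else falls back to keeping every line.
    arg = sys.argv[1] if len(sys.argv) > 1 else ""
    percent = int(arg) if arg.isdigit() else 100
    for kept in sample_lines(sys.stdin, percent):
        print(kept)
```

Run on its own, it reads standard input and writes the sampled lines to standard output, for example `python randomsample.py 10 < input/cite75_99.txt`.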
The following error is thrown:
java.io.IOException: Cannot run program "c:\cygwin64\home\rajs1\hadoop-1.2.1\.\randomsample.py": CreateProcess error=193, %1 is not a valid Win32 application
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:1041)
    at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:214)
    at org.apache.hadoop.streaming.PipeMapper.configure(PipeMapper.java:66)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
    at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:426)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:366)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:223)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:724)
Caused by: java.io.IOException: CreateProcess error=193, %1 is not a valid Win32 application
    at java.lang.ProcessImpl.create(Native Method)
    at java.lang.ProcessImpl.<init>(ProcessImpl.java:376)
    at java.lang.ProcessImpl.start(ProcessImpl.java:136)
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:1022)
Another command I ran throws a similar error. I guess the streaming application should execute the Python application, but it seems to be trying to execute it as a Java application. Please suggest a solution. Thanks in advance.
You might want to try this option as well:
bin/hadoop jar contrib/streaming/hadoop-streaming-1.2.1.jar -D mapred.reduce.tasks=1 -input input/cite75_99.txt -output output -mapper 'python randomsample.py 10' -file randomsample.py
If you try this, you might not get the error. You might instead get an access-denied error for 'python randomsample.py'; in that case, try giving the full path to the Python executable, like this:
bin/hadoop jar contrib/streaming/hadoop-streaming-1.2.1.jar -D mapred.reduce.tasks=1 -input input/cite75_99.txt -output output -mapper 'c:\mfiles\python randomsample.py 10' -file randomsample.py
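Naming the interpreter helps because the JVM starts the mapper through the Windows CreateProcess call. CreateProcess does not read the `#!` line of a script, so a bare `.py` file is rejected with error 193. To find the full path to put in `-mapper`, you can ask Python itself. The sketch below is only an illustration: the helper name `windows_path` is hypothetical, and it assumes the Cygwin `cygpath` utility is on the PATH. When `cygpath` is not available, it returns the path unchanged.

```python
import subprocess
import sys


def windows_path(posix_path):
    """Convert a Cygwin-style path to a Windows path using `cygpath -w`."""
    try:
        out = subprocess.check_output(["cygpath", "-w", posix_path])
    except (OSError, subprocess.CalledProcessError):
        # cygpath is missing or failed (for example, not running under
        # Cygwin): fall back to the path as given.
        return posix_path
    return out.decode().strip()


if __name__ == "__main__":
    # sys.executable is the absolute path of the running interpreter.
    print(windows_path(sys.executable))
```

Run it with the same `python` you use in the Cygwin shell. If the printed path turns out to be a Cygwin symlink rather than a real `.exe`, that may be the cause of the access-denied error, and pointing `-mapper` at the actual executable is worth trying.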
Good luck.