Using Pentaho MapReduce to Parse Weblog Data - MapReduce job won't start, and no error is reported
I followed
http://wiki.pentaho.com/display/BAD/...se+Weblog+Data to create a job "weblog_parse_mr.kjb" and a transformation "weblog_parse_mapper.ktr".
I start the job with the command below, but the MapReduce job never starts and no error is reported; it just keeps waiting.
/mnt/kettle/data-integration/kitchen.sh -file=/home/hduser/kettle_jobs/ini_test_jobs/weblog_parse_mr_less.kjb -level=Debug
The debug log output is as follows:
--------------------------------------------------------------------------------------------------------------------
[hduser@master data-integration]$ /mnt/kettle/data-integration/kitchen.sh -file=/home/hduser/kettle_jobs/ini_test_jobs/weblog_parse_mr_less.kjb -level=Debug
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=256m; support was removed in 8.0
16:25:47,012 INFO [KarafInstance]
*******************************************************************************
*** Karaf Instance Number: 1 at /mnt/kettle/data-integration/./system/karaf ***
*** //data1 ***
*** Karaf Port:8801 ***
*** OSGI Service Port:9050 ***
*******************************************************************************
Apr 01, 2016 4:25:48 PM org.apache.karaf.main.Main$KarafLockCallback lockAquired
INFO: Lock acquired. Setting startlevel to 100
2016/04/01 16:25:48 - Kitchen - Logging is at level : Debug
2016/04/01 16:25:48 - Kitchen - Start of run.
2016/04/01 16:25:48 - Kitchen - Allocate new job.
2016/04/01 16:25:48 - Kitchen - Parsing command line options.
2016/04/01 16:25:49 - cfgbuilder - Warning: The configuration parameter [org] is not supported by the default configuration builder for scheme: sftp
2016-04-01 16:25:52.339:INFO:oejs.Server:jetty-8.1.15.v20140411
2016-04-01 16:25:52.390:INFO:oejs.AbstractConnector:Started NIOSocketConnectorWrapper@0.0.0.0:9050
log4j:ERROR Could not parse url [file:/mnt/kettle/data-integration/./system/osgi/log4j.xml].
java.io.FileNotFoundException: /mnt/kettle/data-integration/./system/osgi/log4j.xml (No such file or directory)
at java.io.FileInputStream.open0(Native Method)
at java.io.FileInputStream.open(FileInputStream.java:195)
at java.io.FileInputStream.<init>(FileInputStream.java:138)
at java.io.FileInputStream.<init>(FileInputStream.java:93)
at sun.net.www.protocol.file.FileURLConn...ction.java:90)
at sun.net.www.protocol.file.FileURLConn...tion.java:188)
at org.apache.log4j.xml.DOMConfigurator$2.parse(DOMConfigurator.java:765)
at org.apache.log4j.xml.DOMConfigurator.doConfigure(DOMConfigurator.java:871)
at org.apache.log4j.xml.DOMConfigurator.doConfigure(DOMConfigurator.java:778)
at org.apache.log4j.helpers.OptionConverter.selectAndConfigure(OptionConverter.java:526)
at org.apache.log4j.LogManager.<clinit>(LogManager.java:127)
at org.apache.log4j.Logger.getLogger(Logger.java:104)
at org.apache.commons.logging.impl.Log4JLogger.getLogger(Log4JLogger.java:262)
at org.apache.commons.logging.impl.Log4JLogger.<init>(Log4JLogger.java:108)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
at org.apache.commons.logging.impl.LogFactoryImpl.createLogFromClass(LogFactoryImpl.java:1025)
at org.apache.commons.logging.impl.LogFactoryImpl.discoverLogImplementation(LogFactoryImpl.java:844)
at org.apache.commons.logging.impl.LogFactoryImpl.newInstance(LogFactoryImpl.java:541)
at org.apache.commons.logging.impl.LogFactoryImpl.getInstance(LogFactoryImpl.java:292)
at org.apache.commons.logging.impl.LogFactoryImpl.getInstance(LogFactoryImpl.java:269)
at org.apache.commons.logging.LogFactory.getLog(LogFactory.java:657)
at org.springframework.osgi.extender.internal.activator.ContextLoaderListener.<clinit>(ContextLoaderListener.java:253)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
at java.lang.Class.newInstance(Class.java:442)
at org.apache.felix.framework.Felix.createBundleActivator(Felix.java:4362)
at org.apache.felix.framework.Felix.activateBundle(Felix.java:2149)
at org.apache.felix.framework.Felix.startBundle(Felix.java:2072)
at org.apache.felix.framework.Felix.setActiveStartLevel(Felix.java:1299)
at org.apache.felix.framework.FrameworkStartLevelImpl.run(FrameworkStartLevelImpl.java:304)
at java.lang.Thread.run(Thread.java:745)
........ (this part of the log removed because the post was getting too long)
2016/04/01 16:26:02 - weblog_parse_mr_less - Start of job execution
2016/04/01 16:26:02 - weblog_parse_mr_less - exec(0, 0, START.0)
2016/04/01 16:26:02 - START - Starting job entry
2016/04/01 16:26:02 - weblog_parse_mr_less - Starting entry [Pentaho MapReduce - mr]
2016/04/01 16:26:02 - weblog_parse_mr_less - exec(1, 0, Pentaho MapReduce - mr.0)
2016/04/01 16:26:02 - Pentaho MapReduce - mr - Starting job entry
2016/04/01 16:26:02 - cfgbuilder - Warning: The configuration parameter [org] is not supported by the default configuration builder for scheme: sftp
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:///mnt/kettle/data-integration/plugins/pentaho-big-data-plugin/hadoop-configurations/hdp22/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:///mnt/kettle/data-integration/plugins/pentaho-big-data-plugin/hadoop-configurations/hdp22/lib/client/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:///mnt/kettle/data-integration/plugins/pentaho-big-data-plugin/hadoop-configurations/hdp22/lib/hive-jdbc-2.0.0-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/mnt/kettle/data-integration/launcher/../lib/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/mnt/kettle/data-integration/plugins/pentaho-big-data-plugin/lib/slf4j-log4j12-1.7.3.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2016/04/01 16:26:03 - weblog_parse_mapper - Dispatching started for transformation [weblog_parse_mapper]
Attempting to load ESAPI.properties via file I/O.
Attempting to load ESAPI.properties as resource file via file I/O.
Not found in 'org.owasp.esapi.resources' directory or file not readable: /mnt/kettle/data-integration/ESAPI.properties
Not found in SystemResource Directory/resourceDirectory: .esapi/ESAPI.properties
Not found in 'user.home' (/home/hduser) directory: /home/hduser/esapi/ESAPI.properties
Loading ESAPI.properties via file I/O failed. Exception was: java.io.FileNotFoundException
Attempting to load ESAPI.properties via the classpath.
SUCCESSFULLY LOADED ESAPI.properties via the CLASSPATH from '/ (root)' using current thread context class loader!
SecurityConfiguration for Validator.ConfigurationFile not found in ESAPI.properties. Using default: validation.properties
Attempting to load validation.properties via file I/O.
Attempting to load validation.properties as resource file via file I/O.
Not found in 'org.owasp.esapi.resources' directory or file not readable: /mnt/kettle/data-integration/validation.properties
Not found in SystemResource Directory/resourceDirectory: .esapi/validation.properties
Not found in 'user.home' (/home/hduser) directory: /home/hduser/esapi/validation.properties
Loading validation.properties via file I/O failed.
Attempting to load validation.properties via the classpath.
validation.properties could not be loaded by any means. fail. Exception was: java.lang.IllegalArgumentException: Failed to load ESAPI.properties as a classloader resource.
SecurityConfiguration for Logger.LogServerIP not either "true" or "false" in ESAPI.properties. Using default: true
2016/04/01 16:26:03 - Pentaho MapReduce - mr - Using org.apache.hadoop.io.Text for the map output value
2016/04/01 16:26:05 - Pentaho MapReduce - mr - Cleaning output path: hdfs://172.16.189.123:9000/user/pdi/weblogs/parse_less
2016/04/01 16:26:05 - Pentaho MapReduce - mr - Using Kettle installation from /opt/pentaho/mapreduce/6.0.1.0-386-6.0.1.0-386-hdp22
2016/04/01 16:26:05 - Pentaho MapReduce - mr - Configuring Pentaho MapReduce job to use Kettle installation from /opt/pentaho/mapreduce/6.0.1.0-386-6.0.1.0-386-hdp22
2016/04/01 16:26:05 - Pentaho MapReduce - mr - mapreduce.application.classpath: classes/,$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*,$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*
2016/04/01 16:26:07 - weblog_parse_mr_less - Triggering heartbeat signal for weblog_parse_mr_less at every 10 seconds
2016/04/01 16:26:17 - weblog_parse_mr_less - Triggering heartbeat signal for weblog_parse_mr_less at every 10 seconds
2016/04/01 16:26:27 - weblog_parse_mr_less - Triggering heartbeat signal for weblog_parse_mr_less at every 10 seconds
2016/04/01 16:26:37 - weblog_parse_mr_less - Triggering heartbeat signal for weblog_parse_mr_less at every 10 seconds
2016/04/01 16:26:47 - weblog_parse_mr_less - Triggering heartbeat signal for weblog_parse_mr_less at every 10 seconds
2016/04/01 16:26:57 - weblog_parse_mr_less - Triggering heartbeat signal for weblog_parse_mr_less at every 10 seconds
......
2016/04/01 16:39:27 - weblog_parse_mr_less - Triggering heartbeat signal for weblog_parse_mr_less at every 10 seconds
2016/04/01 16:39:37 - weblog_parse_mr_less - Triggering heartbeat signal for weblog_parse_mr_less at every 10 seconds
--------------------------------------------------------------------------------------------------------------------
As shown above, there is no error, but the MapReduce job never starts. I don't know what else to try.
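Since the job hangs right after "Configuring Pentaho MapReduce job to use Kettle installation" (i.e., at submission time, before any Setup/Mapper progress lines appear), I suspect the machine running Kitchen cannot reach the cluster's job-submission services. Here is a minimal shell sketch to check reachability from the Kitchen machine. The NameNode address 172.16.189.123:9000 is taken from my log; the ResourceManager port 8032 is only the Hadoop 2.x default and is an assumption that may not match your yarn-site.xml:

```shell
#!/bin/bash
# Check whether the NameNode and YARN ResourceManager are reachable from this machine.
# Host/ports are assumptions: 172.16.189.123:9000 comes from the log above;
# 8032 is the default ResourceManager IPC port in Hadoop 2.x and may differ.

check_port() {
  local host=$1 port=$2
  # Open a raw TCP connection with a 5-second timeout; succeeds only if something is listening.
  timeout 5 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

for target in "172.16.189.123 9000 NameNode" "172.16.189.123 8032 ResourceManager"; do
  set -- $target
  if check_port "$1" "$2"; then
    echo "$3 reachable at $1:$2"
  else
    echo "CANNOT reach $3 at $1:$2"
  fi
done
```

If the ResourceManager line reports unreachable while the NameNode line succeeds (which would be consistent with my log, where the output path on HDFS is cleaned successfully but submission then stalls), the problem is likely a wrong or missing ResourceManager address in the Hadoop configuration used by the Pentaho Big Data plugin, or a firewall between the machines.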
PS:
I have two Hadoop cluster development environments.
On one of them this job runs successfully.
Part of that log is as follows:
------------------------------------------------
......
SecurityConfiguration for Logger.LogServerIP not either "true" or "false" in ESAPI.properties. Using default: true
2016/04/01 16:02:08 - Pentaho MapReduce - mr - Using org.apache.hadoop.io.Text for the map output value
2016/04/01 16:02:09 - Pentaho MapReduce - mr - Cleaning output path: hdfs://192.168.124.129:9000/user/pdi/weblogs/parse_less
2016/04/01 16:02:10 - Pentaho MapReduce - mr - Using Kettle installation from /opt/pentaho/mapreduce/6.0.1.0-386-6.0.1.0-386-hdp22
2016/04/01 16:02:10 - Pentaho MapReduce - mr - Configuring Pentaho MapReduce job to use Kettle installation from /opt/pentaho/mapreduce/6.0.1.0-386-6.0.1.0-386-hdp22
2016/04/01 16:02:10 - Pentaho MapReduce - mr - mapreduce.application.classpath: classes/,$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*,$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*
2016/04/01 16:02:14 - Pentaho MapReduce - mr - Setup Complete: 0.0 Mapper Completion: 0.0 Reducer Completion: 0.0
2016/04/01 16:02:24 - Pentaho MapReduce - mr - Setup Complete: 0.0 Mapper Completion: 0.0 Reducer Completion: 0.0
2016/04/01 16:02:34 - Pentaho MapReduce - mr - Setup Complete: 100.0 Mapper Completion: 0.0 Reducer Completion: 0.0
2016/04/01 16:02:44 - Pentaho MapReduce - mr - Setup Complete: 100.0 Mapper Completion: 0.0 Reducer Completion: 0.0
2016/04/01 16:02:54 - Pentaho MapReduce - mr - Setup Complete: 100.0 Mapper Completion: 0.0 Reducer Completion: 0.0
2016/04/01 16:03:55 - Pentaho MapReduce - mr - Setup Complete: 100.0 Mapper Completion: 0.0 Reducer Completion: 0.0
2016/04/01 16:04:05 - Pentaho MapReduce - mr - Setup Complete: 100.0 Mapper Completion: 0.0 Reducer Completion: 0.0
2016/04/01 16:04:15 - Pentaho MapReduce - mr - Setup Complete: 100.0 Mapper Completion: 44.475254 Reducer Completion: 0.0
2016/04/01 16:04:25 - Pentaho MapReduce - mr - Setup Complete: 100.0 Mapper Completion: 61.912037 Reducer Completion: 0.0
2016/04/01 16:04:35 - Pentaho MapReduce - mr - Setup Complete: 100.0 Mapper Completion: 66.66667 Reducer Completion: 0.0
2016/04/01 16:04:35 - Pentaho MapReduce - mr - [SUCCEEDED] -- Task: attempt_1459477876939_0003_m_000001_0 Attempt: attempt_1459477876939_0003_m_000001_0 Event: 0
2016/04/01 16:04:35 - Pentaho MapReduce - mr - Container killed by the ApplicationMaster.
2016/04/01 16:04:35 - Pentaho MapReduce - mr - Container killed on request. Exit code is 143
2016/04/01 16:04:35 - Pentaho MapReduce - mr - Container exited with a non-zero exit code 143
2016/04/01 16:04:35 - Pentaho MapReduce - mr - [SUCCEEDED] -- Task: attempt_1459477876939_0003_m_000000_0 Attempt: attempt_1459477876939_0003_m_000000_0 Event: 1
......
------------------------------------------------
The software versions I used are as follows:
pdi-ce-6.0.1.0-386
hadoop-2.7.1
CentOS 6.4
Thanks for reading.