Channel: Pentaho Community Forums

Replace String

Hi,

I have an input as
Filename
CR12345.TIF
CR113_Activity.TIF
EP45.TIF
EP45_Activty.TIF

I want the output as
NewName
CR12345
CR113
EP45
EP45

I have to remove the .TIF extension and the _Activity.TIF suffix.

How can I do this?
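
What I'm essentially after is a pattern like the one below - a minimal Java sketch just to show the regex I mean (the same expression could go in a Replace in string step with RegEx enabled, or a Modified Java Script Value / User Defined Java Class step); it assumes the suffix is spelled "_Activity":

Code:

public class StripTifSuffix {
    public static void main(String[] args) {
        String[] filenames = { "CR12345.TIF", "CR113_Activity.TIF", "EP45.TIF", "EP45_Activity.TIF" };
        for (String filename : filenames) {
            // Drop an optional "_Activity" suffix together with the ".TIF" extension.
            String newName = filename.replaceAll("(_Activity)?\\.TIF$", "");
            System.out.println(newName); // CR12345, CR113, EP45, EP45
        }
    }
}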


-Ramya-

Getting error while loading data into Oracle DB (auto-commit error)

Please help,

The error occurs during the Table Output step.

The database connection is getting closed after the first commit.


When I first set the commit interval to 1000, 1000 records were loaded.

When I increased it to 5000, 5000 records were loaded.

Please see the attached log.

Snapshot of the log:

2015/06/15 16:25:38 - UK_3 - Connected to database.
2015/06/15 16:25:38 - Load UKIA_2.0 - Connected to database [UK_3] (commit=1000)
2015/06/15 16:25:38 - UK_3 - Auto commit off
2015/06/15 16:25:38 - UKIA_0 - Step [Get rows from result.0] initialized flawlessly.
2015/06/15 16:25:38 - UKIA_0 - Step [Load UKIA_2.0] initialized flawlessly.
2015/06/15 16:25:38 - UKIA_0 - Transformation has allocated 2 threads and 1 rowsets.
2015/06/15 16:25:38 - Get rows from result.0 - Starting to run...
2015/06/15 16:25:38 - Load UKIA_2.0 - Starting to run...
2015/06/15 16:25:38 - Load UKIA_2.0 - Prepared statement : INSERT INTO UKIA_2 (PTCABS, CLIENT, CASENO, ID_NUM, FRAUDDTE, CATEGORY, ADD_INFO, LOADDATE, OPERATOR_ID, CONFIRMED_IND, ARCHIVE_DATE, TRANSACTION_TYPE, VARIABLE_DATA, TITLE_NARR, FIRST_NAME, SECOND_NAME, SURNAME, ORIG_ADDR_LINE_1, ORIG_ADDR_LINE_2, ORIG_ADDR_LINE_3, ORIG_ADDR_LINE_4, ORIG_ADDR_LINE_5, ORIG_POSTCODE, LNK_HOUSE_OCC_COUNT, LINK_PTCABS, LINK_INF_IND, MULTI_ADDR_IND, TIMESTAMP, FRAUD_CATEGORY, EXTRACT_IND, REFILING_IND, COMPANY_NUMBER, COMPANY_NAME, SUB_CAT_GRP_1, SUB_CAT_GRP_2, SUB_CAT_GRP_3, SUB_CAT_GRP_4, SUB_CAT_GRP_5, DATE_OF_BIRTH, PRODUCT_CODE, SECTION_29_FLAG, CASE_TYPE, SUBJECT_ROLE, SUBJECT_ROLE_QUALIFIER, CASEID, NAMEKEYA, NAMEKEYB, NAMEKEYC, NAMEKEYD, NAMEKEYE, NAMEKEYF, NAMEKEYASTD, NAMEKEYCSTD) VALUES ( ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
2015/06/15 16:25:38 - UK_3 - Commit on database connection [UK_3]
2015/06/15 16:25:38 - UK_3 - Rollback on database connection [UK_3]
2015/06/15 16:25:38 - UKIA_0 - ERROR (version 5.0.1-stable, build 1 from 2013-11-15_16-08-58 by buildguy) : Errors detected!
2015/06/15 16:25:38 - Load UKIA_2.0 - ERROR (version 5.0.1-stable, build 1 from 2013-11-15_16-08-58 by buildguy) : Because of an error, this step can't continue:
2015/06/15 16:25:38 - UKIA_0 - ERROR (version 5.0.1-stable, build 1 from 2013-11-15_16-08-58 by buildguy) : Errors detected!
2015/06/15 16:25:38 - Get rows from result.0 - Finished processing (I=0, O=0, R=11901, W=11901, U=0, E=0)
2015/06/15 16:25:38 - Load UKIA_2.0 - ERROR (version 5.0.1-stable, build 1 from 2013-11-15_16-08-58 by buildguy) : org.pentaho.di.core.exception.KettleException:
2015/06/15 16:25:38 - Load UKIA_2.0 - Error inserting row into table [UKIA_2] with values: [E080], [BOAO0010], [E0845E], [ 603], [], [Y], [CIFS0140], [], [I], [], [], [CHATTA], [], [QASIARMEHMOOD], [THE CORE], [], [COUNTY WAY], [], [BARNSLEY], [S70 2JW], [ 1], [E080], [A], [], [], [], [], [ 0], [], [], [], [], [], [], [UKBA], [], [CFR], [S1A], [AL1], [3237413], [2014/04/28 00:00:00.000], [2014/04/28 00:00:00.000], [2017/04/28 00:00:00.000], [1974/08/13 00:00:00.000], [CHATTAQASIARMEHMOOD], [CHTTQSRMHMD], [CHATTA], [CHTT], [C], [C], [null], [null]
2015/06/15 16:25:38 - Load UKIA_2.0 -
2015/06/15 16:25:38 - Load UKIA_2.0 - Unexpected error inserting row
2015/06/15 16:25:38 - Load UKIA_2.0 - -32233
2015/06/15 16:25:38 - Load UKIA_2.0 -
2015/06/15 16:25:38 - Load UKIA_2.0 -
2015/06/15 16:25:38 - Load UKIA_2.0 - at org.pentaho.di.trans.steps.tableoutput.TableOutput.writeToTable(TableOutput.java:445)
2015/06/15 16:25:38 - Load UKIA_2.0 - at org.pentaho.di.trans.steps.tableoutput.TableOutput.processRow(TableOutput.java:128)
2015/06/15 16:25:38 - Load UKIA_2.0 - at org.pentaho.di.trans.step.RunThread.run(RunThread.java:60)
2015/06/15 16:25:38 - Load UKIA_2.0 - at java.lang.Thread.run(Thread.java:744)
2015/06/15 16:25:38 - Load UKIA_2.0 - Caused by: org.pentaho.di.core.exception.KettleDatabaseException:
2015/06/15 16:25:38 - Load UKIA_2.0 - Unexpected error inserting row
2015/06/15 16:25:38 - Load UKIA_2.0 - -32233
2015/06/15 16:25:38 - Load UKIA_2.0 -
2015/06/15 16:25:38 - Load UKIA_2.0 - at org.pentaho.di.trans.steps.tableoutput.TableOutput.writeToTable(TableOutput.java:341)
2015/06/15 16:25:38 - Load UKIA_2.0 - ... 3 more
2015/06/15 16:25:38 - Load UKIA_2.0 - Caused by: java.lang.ArrayIndexOutOfBoundsException: -32233
2015/06/15 16:25:38 - Load UKIA_2.0 - at oracle.jdbc.driver.OraclePreparedStatement.setupBindBuffers(OraclePreparedStatement.java:2677)
2015/06/15 16:25:38 - Load UKIA_2.0 - at oracle.jdbc.driver.OraclePreparedStatement.executeBatch(OraclePreparedStatement.java:9255)
2015/06/15 16:25:38 - Load UKIA_2.0 - at oracle.jdbc.driver.OracleStatementWrapper.executeBatch(OracleStatementWrapper.java:210)
2015/06/15 16:25:38 - Load UKIA_2.0 - at org.pentaho.di.trans.steps.tableoutput.TableOutput.writeToTable(TableOutput.java:315)
2015/06/15 16:25:38 - Load UKIA_2.0 - ... 3 more
2015/06/15 16:25:38 - Load UKIA_2.0 - Signaling 'output done' to 0 output rowsets.
2015/06/15 16:25:38 - UK_3 - Commit on database connection [UK_3]
2015/06/15 16:25:38 - Load UKIA_2.0 - Stopped while putting a row on the buffer
2015/06/15 16:25:38 - Load UKIA_2.0 - Stopped while putting a row on the buffer
2015/06/15 16:25:38 - Load UKIA_2.0 - Stopped while putting a row on the buffer
2015/06/15 16:25:38 - Load UKIA_2.0 - Stopped while putting a row on the buffer
2015/06/15 16:25:38 - Load UKIA_2.0 - Stopped while putting a row on the buffer
Attached Files

spoon.sh not opening, i.e. Data Integration GUI mode not starting

Hi,
I am receiving the following error messages while running ./spoon.sh. I wonder how I can solve it. I am using the RHEL operating system.

Refreshing GOE props...
Trying to add database driver (JDBC): RmiJdbc.RJDriver - Warning, not in CLASSPATH?
Trying to add database driver (JDBC): jdbc.idbDriver - Warning, not in CLASSPATH?
Trying to add database driver (JDBC): org.gjt.mm.mysql.Driver - Warning, not in CLASSPATH?
Trying to add database driver (JDBC): com.mckoi.JDBCDriver - Warning, not in CLASSPATH?
[KnowledgeFlow] Loading properties and plugins...
[KnowledgeFlow] Initializing KF...
Refreshing GOE props...
Trying to add database driver (JDBC): RmiJdbc.RJDriver - Warning, not in CLASSPATH?
Trying to add database driver (JDBC): jdbc.idbDriver - Warning, not in CLASSPATH?
Trying to add database driver (JDBC): org.gjt.mm.mysql.Driver - Warning, not in CLASSPATH?
Trying to add database driver (JDBC): com.mckoi.JDBCDriver - Warning, not in CLASSPATH?
[KnowledgeFlow] Loading properties and plugins...
[KnowledgeFlow] Initializing KF...
Refreshing GOE props...
Trying to add database driver (JDBC): RmiJdbc.RJDriver - Warning, not in CLASSPATH?
Trying to add database driver (JDBC): jdbc.idbDriver - Warning, not in CLASSPATH?
Trying to add database driver (JDBC): org.gjt.mm.mysql.Driver - Warning, not in CLASSPATH?
Trying to add database driver (JDBC): com.mckoi.JDBCDriver - Warning, not in CLASSPATH?
[KnowledgeFlow] Loading properties and plugins...
[KnowledgeFlow] Initializing KF...
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x000000307860e02c, pid=2152, tid=140645496964864
#
# JRE version: Java(TM) SE Runtime Environment (7.0_60-b19) (build 1.7.0_60-b19)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (24.60-b09 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C [ld-linux-x86-64.so.2+0xe02c]
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /home/pentaho/pdi/design-tools/data-integration/hs_err_pid2152.log
#
# If you would like to submit a bug report, please visit:
# http://bugreport.sun.com/bugreport/crash.jsp
#
./spoon.sh: line 190: 2152 Aborted (core dumped) "$_PENTAHO_JAVA" $OPT $STARTUP -lib $LIBPATH "${1+$@}"
[pentaho@vertica-srv1 data-integration]$


Looking forward to hearing from you.

ujjwal

Create folder structure from data dump

Hi,

I have got a data dump whose folder structure is not in the required format. I need to group those folders into one so that I can read the files in bulk.

Please find the screenshots for the input format and the required folder structure format.

Bug in Pentaho Report Designer 5.3 ???

Hey Team,

It's been a couple of days since we moved from Pentaho 5.1 to Pentaho 5.3.

We are generating reports where a custom sorting order is applied in the MySQL query.

When we group the report using the sorted field, we get the output data in ascending order irrespective of the order we specified. We tried running the same report in Pentaho 4.5 and 5.1 PRD; they work as expected, but 5.3 is really driving us nuts here.

Can someone please tell us if something is wrong on our end? Any help is highly appreciated.

Thanks,
Santosh :cool:

Protect my project

Good morning
I developed a CDE dashboard in Pentaho, but now I want to hide my code and files from the people who will use my Pentaho project in their applications. Do you know how to hide or mask them from the users? In other words, how can I make my project files runnable only?

Error when running job using Kitchen

Hi, I'm new to this group, so I hope I've posted this in the correct place. I'm trying to run a Kettle job using Kitchen from a batch file on Windows, but I keep getting this error:

INFO 12-06 15:26:47,690 - AROVERLAYGROUPS from Environment : null
INFO 12-06 15:26:48,364 - fipsProviderJsafeJCE installed = false
INFO 12-06 15:26:48,926 - set timer period 60000 milliSec (00:01:00)
INFO 12-06 15:26:48,961 - Connects to APPDAVIRTUAL1DB:11990 through com.bmc.arsys.api.ProxyJRpc@6e35c871
INFO 12-06 15:26:48,974 - Reading pentaho configuration from UDM:Config
ERROR: Kitchen can't continue because the job couldn't be loaded.

The batch file is as follows:

kitchen.bat /file:"D:\SMT\Integrations\Kettle\FNMP\FNMP.kjb" /server:APPDAVIRTUAL### /port:11990 /user:2487247 /pass:#### /level:Detailed > "D:\SMT\Logs\Kettle\trans.log"

Can anyone help me out please?

CGG Chart doesn't show in PRD (and subsequently on BA server)

Dear all,

I'm trying to create a custom dashboard and export some of its charts to PDF through a PRD report published on the server. I was following a guide (http://pentaho-bi-suite.blogspot.cz/...images-to.html), but I encountered a problem which I'm not able to solve. I've created the dashboard and generated URLs for all the reports. When I take one of these URLs and use it in a browser, it displays the generated chart correctly. But when I try to use it in Pentaho Report Designer, the image is not loaded (the chart is not visible). When I upload this PRD report to the server and run it there, it fails in the same way - the picture is not loaded. I was able to find errors like this in the PRD logs:

2015-06-16 15:57:40,930 [7773856] WARN - org.pentaho.reporting.engine.classic.core.filter.types.ContentType - Failed to load content using value localhost:8080/pentaho/plugin/cgg/api/services/draw?script=/home/pat/Naposledy2_DataDelivery.js&outputType=svg&userid=admin&password=password
org.pentaho.reporting.libraries.resourceloader.ResourceKeyCreationException: Unable to create key: No loader was able to handle the given key data: localhost:8080/pentaho/plugin/cgg/api/services/draw?script=/home/pat/Naposledy2_DataDelivery.js&outputType=svg&userid=admin&password=password
at org.pentaho.reporting.libraries.resourceloader.DefaultResourceManagerBackend.createKey(DefaultResourceManagerBackend.java:71)

I have tried generating the charts in both formats (PNG and SVG), but the result is always the same - the chart is displayed in the browser, but not in PRD.

Do I need to set something in PRD? Or copy some libraries somewhere?

I'm using:
- Pentaho BA - 5.3.0.0.213
- CTools - 5.3.0.0.213 (at least I hope so; it's the version displayed in the Marketplace for CDF, CDA, CDE, CGG and Sparkl)
- PRD - 5.0.1

Thank you for any help.

30% accuracy difference between cross-validation and testing with a test set (Weka)

I'm new to Weka and I have a problem with my text classification project using it.


I have a training dataset with 1000 instances and one of 200 for testing. The problem is that when I try to test the performance of some algorithms (like RandomForest), the numbers given by cross-validation and by the test set are really different. I thought the basic idea behind CV and a test set was more or less the same (an independent set of instances (the folds) used to estimate the accuracy of the algorithm), so I guessed the accuracies would be similar.


Here is an example of Weka's log with cross-validation:

Quote:

=== Run information ===

Scheme:weka.classifiers.trees.RandomForest -I 100 -K 0 -S 1
Relation: testData-weka.filters.unsupervised.attribute.StringToWordVector-R1-W10000000-prune-rate-1.0-T-I-N0-L-stemmerweka.core.stemmers.IteratedLovinsStemmer-M1-O-tokenizerweka.core.tokenizers.WordTokenizer -delimiters " \r\n\t.,;:\"\'()?!--+-í+*&#$\\/=<>[]_`@"-weka.filters.supervised.attribute.AttributeSelection-Eweka.attributeSelection.InfoGainAttributeEval-Sweka.attributeSelection.Ranker -T 0.0 -N -1
Instances: 1000
Attributes: 276

[list of attributes omitted]
Test mode:10-fold cross-validation

=== Classifier model (full training set) ===

Random forest of 100 trees, each constructed while considering 9 random features.
Out of bag error: 0.269



Time taken to build model: 4.9 seconds

=== Stratified cross-validation ===
=== Summary ===

Correctly Classified Instances 740 74 %
Incorrectly Classified Instances 260 26 %
Kappa statistic 0.5674
Mean absolute error 0.2554
Root mean squared error 0.3552
Relative absolute error 60.623 %
Root relative squared error 77.4053 %
Total Number of Instances 1000

=== Detailed Accuracy By Class ===

TP Rate FP Rate Precision Recall F-Measure ROC Area Class
0.479 0.083 0.723 0.479 0.576 0.795 I
0.941 0.352 0.707 0.941 0.808 0.894 E
0.673 0.023 0.889 0.673 0.766 0.964 R
Weighted Avg. 0.74 0.198 0.751 0.74 0.727 0.878

=== Confusion Matrix ===

a b c <-- classified as
149 148 14 | a = I
24 447 4 | b = E
33 37 144 | c = R
72.5%, it's something...

But now if I try with my test set of 200 instances...

Quote:

=== Run information ===

Scheme:weka.classifiers.trees.RandomForest -I 100 -K 0 -S 1
Relation: testData-weka.filters.unsupervised.attribute.StringToWordVector-R1-W10000000-prune-rate-1.0-T-I-N0-L-stemmerweka.core.stemmers.IteratedLovinsStemmer-M1-O-tokenizerweka.core.tokenizers.WordTokenizer -delimiters " \r\n\t.,;:\"\'()?!--+-í+*&#$\\/=<>[]_`@"-weka.filters.supervised.attribute.AttributeSelection-Eweka.attributeSelection.InfoGainAttributeEval-Sweka.attributeSelection.Ranker -T 0.0 -N -1
Instances: 1000
Attributes: 276

[list of attributes omitted]
Test mode:user supplied test set: size unknown (reading incrementally)

=== Classifier model (full training set) ===

Random forest of 100 trees, each constructed while considering 9 random features.
Out of bag error: 0.269



Time taken to build model: 4.72 seconds

=== Evaluation on test set ===
=== Summary ===

Correctly Classified Instances 86 43 %
Incorrectly Classified Instances 114 57 %
Kappa statistic 0.2061
Mean absolute error 0.3829
Root mean squared error 0.4868
Relative absolute error 84.8628 %
Root relative squared error 99.2642 %
Total Number of Instances 200

=== Detailed Accuracy By Class ===

TP Rate FP Rate Precision Recall F-Measure ROC Area Class
0.17 0.071 0.652 0.17 0.27 0.596 I
0.941 0.711 0.312 0.941 0.468 0.796 E
0.377 0 1 0.377 0.548 0.958 R
Weighted Avg. 0.43 0.213 0.671 0.43 0.405 0.758

=== Confusion Matrix ===

a b c <-- classified as
15 73 0 | a = I
3 48 0 | b = E
5 33 23 | c = R


43% ... obviously, something is really wrong. I used batch filtering for the test set, with the following commands on the command line:

Quote:

java -cp "C:\Program Files\Weka-3-6\weka.jar" weka.filters.unsupervised.attribute.StringToWordVector -T -I -O -L -stemmer weka.core.stemmers.IteratedLovinsStemmer -M 1 -tokenizer "weka.core.tokenizers.WordTokenizer -delimiters \" \\r\\n\\t.,;:\\\"\\'()?!-¿¡+*&#$%\\\\/=<>[]_`@\"" -W 10000000 -b -i trainDataSet1000.arff -o trainDataSet1000.vector.arff -r testDataSet.arff -s testDataSet1000.vector.arff


java -cp "C:\Program Files\Weka-3-6\weka.jar" weka.filters.supervised.attribute.AttributeSelection -c 1 -E weka.attributeSelection.InfoGainAttributeEval -S "weka.attributeSelection.Ranker -T 0.0" -b -i trainDataSet1000.vector.arff -o trainDataSet1000.vector.final.arff -r testDataSet1000.vector.arff -s testDataSet1000.vector.final.arff
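
For reference, my understanding is that the batch filtering above corresponds to something like this in the Weka Java API - just a rough sketch with the same ARFF file names, the StringToWordVector options left at defaults for brevity, and the class assumed to be the first attribute (as in the -c 1 option):

Code:

import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;
import weka.filters.Filter;
import weka.filters.unsupervised.attribute.StringToWordVector;

public class BatchFilterSketch {
    public static void main(String[] args) throws Exception {
        Instances train = DataSource.read("trainDataSet1000.arff");
        Instances test = DataSource.read("testDataSet.arff");
        train.setClassIndex(0);   // class is the first attribute (-c 1)
        test.setClassIndex(0);

        StringToWordVector stwv = new StringToWordVector();
        // Initialise the filter ONCE on the training data...
        stwv.setInputFormat(train);
        Instances trainVec = Filter.useFilter(train, stwv);
        // ...then reuse the SAME filter on the test data, so both sets
        // end up with an identical attribute dictionary.
        Instances testVec = Filter.useFilter(test, stwv);

        System.out.println("Same attribute count: "
                + (trainVec.numAttributes() == testVec.numAttributes()));
    }
}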





What am I doing wrong? I manually classified the test and training sets using the same criteria, so I find these differences strange.


Any help will be really appreciated


Thanks, and excuse my English.

SubReport Adding Space - Pentaho Report Designer

All,

I need help on this one. When I create a banded subreport, for some reason a space is being added when I export it to Excel. Any ideas as to why? I've attached the report that I'm working on: mainreport.prpt

Mail Validator NoClassDefFoundError

A transformation with a Mail Validator step runs fine in Spoon, but when it is called from the Pentaho User Console -> Action Sequence -> Job -> Transformation it produces this error:

2015-06-16 14:13:32,443 ERROR [org.pentaho.di] Mail Validator - Unexpected error
2015-06-16 14:13:32,443 ERROR [org.pentaho.di] Mail Validator - java.lang.NoClassDefFoundError: org/apache/commons/validator/GenericValidator
at org.pentaho.di.trans.steps.mailvalidator.MailValidation.isRegExValid(MailValidation.java:43)
at org.pentaho.di.trans.steps.mailvalidator.MailValidation.isAddressValid(MailValidation.java:142)
at org.pentaho.di.trans.steps.mailvalidator.MailValidator.processRow(MailValidator.java:157)
at org.pentaho.di.trans.step.RunThread.run(RunThread.java:40)
at java.lang.Thread.run(Thread.java:722)


I assume I need to reference commons-validator-1.3.1.jar somewhere, but where? I've tried CLASSPATH and catalina.properties, but to no avail.
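
From the stack trace, the step only seems to reach commons-validator for its regular-expression check - roughly the kind of call below (just a sketch; whether it uses exactly this method is my guess, but either way it needs commons-validator-1.3.1.jar visible to the server's classloader):

Code:

import org.apache.commons.validator.GenericValidator;

public class MailRegexCheck {
    public static void main(String[] args) {
        // Requires commons-validator-1.3.1.jar on the classpath - the jar
        // the server apparently cannot see.
        String email = "someone@example.com";
        boolean ok = GenericValidator.matchRegexp(email, "^[^@\\s]+@[^@\\s]+\\.[^@\\s]+$");
        System.out.println(email + " valid? " + ok);
    }
}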

Environment:
Pentaho Data Integration 4.2.0-stable
Windows Server 2008 R2 Standard
Java(TM) SE Runtime Environment (build 1.7.0_03-b05)

Text File Input - Then and Now

Hi Everyone,

I am working with a former client (it's nice to get follow-ons) who is using the Text File Input step in many, many transforms.

Incoming zip files are dropped into a processing directory, and the transforms read from the zip archives - using a form that does not look supported (at least in the current documentation).

Under the step's File tab, in the 'Selected Files:' list, the File/Directory field contains zip:file://${FILEPATH} and the Wildcard (RegExp) field contains (\d{4}_)?inkr0\.csv.

The FILEPATH variable is set to the full path and name of the zip file, e.g. C:\Work\Dev\ETL-PROCESSOR\1234_20150527170223_day.zip.
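
As I understand it, these paths are resolved through Apache Commons VFS, so the step's configuration is roughly equivalent to the sketch below (written against the Commons VFS 2 API with a hypothetical path in place of ${FILEPATH}; older Kettle releases bundle VFS 1, where only the package name differs):

Code:

import org.apache.commons.vfs2.FileObject;
import org.apache.commons.vfs2.FileSystemManager;
import org.apache.commons.vfs2.VFS;

public class ZipVfsSketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical value standing in for ${FILEPATH}.
        String zipFile = "C:/Work/Dev/ETL-PROCESSOR/1234_20150527170223_day.zip";

        FileSystemManager fsManager = VFS.getManager();
        // Resolve the archive itself as a virtual folder, as zip:file:// does in the step.
        FileObject zipRoot = fsManager.resolveFile("zip:file:///" + zipFile);

        // List the entries and keep only those matching the step's wildcard.
        for (FileObject entry : zipRoot.getChildren()) {
            String name = entry.getName().getBaseName();
            if (name.matches("(\\d{4}_)?inkr0\\.csv")) {
                System.out.println("Would read: " + name);
            }
        }
    }
}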

I see from the current documentation that you can set 'Compression' to read only the first file in the archive. The customer *does not* have that set, and there are many CSV files within the zip archive. The documentation says:

Compression: Enable if your text file is placed in a Zip or GZip archive. Note: At the moment, only the first file in the archive is read.

Behavior when manually running the transforms within Spoon is sometimes 'hit or miss'. Sometimes the Show filename(s) and Show file content buttons will work, and sometimes they won't.

My main question is:
Was there a point where what the customer is doing was supported, and that functionality has since been deprecated?

Or is this 'unsupported' behavior that has never been a feature of Text File Input? Is it bad juju, and should they stop doing it?

The reason they are doing this is to save on processing time (speed) by not having to unpack the zip files.

The release they are doing this on is 4.4.0-stable, build date 2012-11-21 16.02.21. They are planning (in the near future) to migrate to the latest community version of Kettle.

Thanks in advance,

Resp.,
Irshmun

Headless Pentaho with Hadoop

I'm trying to run Pentaho Kitchen on a headless server, but it seems not to be possible since I need to load the Big Data plugin.
So I inserted the following at the beginning of spoon.sh:
set KETTLE_PLUGIN_PACKAGES=$KETTLE_PLUGIN_PACKAGES,/usr/local/pentaho/data-integration/plugins
export KETTLE_PLUGIN_PACKAGES

Unfortunately, when I do this, Kitchen and Pan will not work on the headless server. The following error starts to happen:
Code:

(process:18952): Gtk-WARNING **: Locale not supported by C library.
        Using the fallback 'C' locale.
org.eclipse.swt.SWTError: No more handles [gtk_init_check() failed]
        at org.eclipse.swt.SWT.error(Unknown Source)
        at org.eclipse.swt.widgets.Display.createDisplay(Unknown Source)
        at org.eclipse.swt.widgets.Display.create(Unknown Source)
        at org.eclipse.swt.graphics.Device.<init>(Unknown Source)
        at org.eclipse.swt.widgets.Display.<init>(Unknown Source)
        at org.eclipse.swt.widgets.Display.<init>(Unknown Source)
        at org.pentaho.di.ui.spoon.Spoon.main(Spoon.java:611)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.pentaho.commons.launcher.Launcher.main(Launcher.java:92)


Is there any way to use Kitchen and Pan without depending on GTK?
This would be useful to have on a headless server, so clients could submit their kjb or ktr files to the server after using Kettle on their workstations.

I'm trying to find a way to adapt the kjb so that the <type>HadoopTransJobExecutorPlugin</type> entry could be discarded. That way the dependency on the plugin would be eliminated, and KETTLE_PLUGIN_PACKAGES wouldn't need to be defined.

Pentaho report connecting to Hive

Hi,

I am using Pentaho Report Designer CE version 5.4, and I am trying to connect to a Hive data source using the jars below:

hive-jdbc-0.10.0.jar
pentaho-hadoop-hive-jdbc-shim-5.0.4.jar

But I am getting the error below:

Error connecting to database [Hadoop1] : org.pentaho.di.core.exception.KettleDatabaseException:
Error occurred while trying to connect to the database

Error connecting to database: (using class org.apache.hive.jdbc.HiveDriver)
Could not open connection to jdbc:hive2://kor1046740.apac.bosch.com:10000/default: java.net.ConnectException: Connection refused: connect


org.pentaho.di.core.exception.KettleDatabaseException:
Error occurred while trying to connect to the database

Error connecting to database: (using class org.apache.hive.jdbc.HiveDriver)
Could not open connection to jdbc:hive2://kor1046740.apac.bosch.com:10000/default: java.net.ConnectException: Connection refused: connect


at org.pentaho.di.core.database.Database.normalConnect(Database.java:428)
at org.pentaho.di.core.database.Database.connect(Database.java:358)
at org.pentaho.di.core.database.Database.connect(Database.java:311)
at org.pentaho.di.core.database.Database.connect(Database.java:301)
at org.pentaho.di.core.database.DatabaseFactory.getConnectionTestReport(DatabaseFactory.java:80)
at org.pentaho.di.core.database.DatabaseMeta.testConnection(DatabaseMeta.java:2686)
at org.pentaho.ui.database.event.DataHandler.testDatabaseConnection(DataHandler.java:546)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at org.pentaho.ui.xul.impl.AbstractXulDomContainer.invoke(AbstractXulDomContainer.java:313)
at org.pentaho.ui.xul.swing.tags.SwingButton$OnClickRunnable.run(SwingButton.java:71)
at java.awt.event.InvocationEvent.dispatch(Unknown Source)
at java.awt.EventQueue.dispatchEventImpl(Unknown Source)
at java.awt.EventQueue.access$200(Unknown Source)
at java.awt.EventQueue$3.run(Unknown Source)
at java.awt.EventQueue$3.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at java.security.ProtectionDomain$1.doIntersectionPrivilege(Unknown Source)
at java.awt.EventQueue.dispatchEvent(Unknown Source)
at java.awt.EventDispatchThread.pumpOneEventForFilters(Unknown Source)
at java.awt.EventDispatchThread.pumpEventsForFilter(Unknown Source)
at java.awt.EventDispatchThread.pumpEventsForFilter(Unknown Source)
at java.awt.WaitDispatchSupport$2.run(Unknown Source)
at java.awt.WaitDispatchSupport$4.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at java.awt.WaitDispatchSupport.enter(Unknown Source)
at java.awt.Dialog.show(Unknown Source)
at java.awt.Component.show(Unknown Source)
at java.awt.Component.setVisible(Unknown Source)
at java.awt.Window.setVisible(Unknown Source)
at java.awt.Dialog.setVisible(Unknown Source)
at org.pentaho.ui.xul.swing.tags.SwingDialog.show(SwingDialog.java:250)
at org.pentaho.reporting.ui.datasources.jdbc.ui.XulDatabaseDialog.open(XulDatabaseDialog.java:254)
at org.pentaho.reporting.ui.datasources.jdbc.ui.ConnectionPanel$EditDataSourceAction.actionPerformed(ConnectionPanel.java:159)
at javax.swing.AbstractButton.fireActionPerformed(Unknown Source)
at javax.swing.AbstractButton$Handler.actionPerformed(Unknown Source)
at javax.swing.DefaultButtonModel.fireActionPerformed(Unknown Source)
at javax.swing.DefaultButtonModel.setPressed(Unknown Source)
at javax.swing.plaf.basic.BasicButtonListener.mouseReleased(Unknown Source)
at java.awt.AWTEventMulticaster.mouseReleased(Unknown Source)
at java.awt.AWTEventMulticaster.mouseReleased(Unknown Source)
at java.awt.Component.processMouseEvent(Unknown Source)
at javax.swing.JComponent.processMouseEvent(Unknown Source)
at java.awt.Component.processEvent(Unknown Source)
at java.awt.Container.processEvent(Unknown Source)
at java.awt.Component.dispatchEventImpl(Unknown Source)
at java.awt.Container.dispatchEventImpl(Unknown Source)
at java.awt.Component.dispatchEvent(Unknown Source)
at java.awt.LightweightDispatcher.retargetMouseEvent(Unknown Source)
at java.awt.LightweightDispatcher.processMouseEvent(Unknown Source)
at java.awt.LightweightDispatcher.dispatchEvent(Unknown Source)
at java.awt.Container.dispatchEventImpl(Unknown Source)
at java.awt.Window.dispatchEventImpl(Unknown Source)
at java.awt.Component.dispatchEvent(Unknown Source)
at java.awt.EventQueue.dispatchEventImpl(Unknown Source)
at java.awt.EventQueue.access$200(Unknown Source)
at java.awt.EventQueue$3.run(Unknown Source)
at java.awt.EventQueue$3.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at java.security.ProtectionDomain$1.doIntersectionPrivilege(Unknown Source)
at java.security.ProtectionDomain$1.doIntersectionPrivilege(Unknown Source)
at java.awt.EventQueue$4.run(Unknown Source)
at java.awt.EventQueue$4.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at java.security.ProtectionDomain$1.doIntersectionPrivilege(Unknown Source)
at java.awt.EventQueue.dispatchEvent(Unknown Source)
at java.awt.EventDispatchThread.pumpOneEventForFilters(Unknown Source)
at java.awt.EventDispatchThread.pumpEventsForFilter(Unknown Source)
at java.awt.EventDispatchThread.pumpEventsForFilter(Unknown Source)
at java.awt.WaitDispatchSupport$2.run(Unknown Source)
at java.awt.WaitDispatchSupport$4.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at java.awt.WaitDispatchSupport.enter(Unknown Source)
at java.awt.Dialog.show(Unknown Source)
at java.awt.Component.show(Unknown Source)
at java.awt.Component.setVisible(Unknown Source)
at java.awt.Window.setVisible(Unknown Source)
at java.awt.Dialog.setVisible(Unknown Source)
at org.pentaho.reporting.libraries.designtime.swing.CommonDialog.setVisible(CommonDialog.java:281)
at org.pentaho.reporting.libraries.designtime.swing.CommonDialog.performEdit(CommonDialog.java:193)
at org.pentaho.reporting.ui.datasources.jdbc.ui.JdbcDataSourceDialog.performConfiguration(JdbcDataSourceDialog.java:798)
at org.pentaho.reporting.ui.datasources.jdbc.JdbcDataSourcePlugin.performEdit(JdbcDataSourcePlugin.java:71)
at org.pentaho.reporting.designer.core.actions.report.EditQueryAction.performEdit(EditQueryAction.java:158)
at org.pentaho.reporting.designer.core.editor.structuretree.DataReportTree$EditQueryDoubleClickHandler.mouseClicked(DataReportTree.java:487)
at java.awt.AWTEventMulticaster.mouseClicked(Unknown Source)
at java.awt.AWTEventMulticaster.mouseClicked(Unknown Source)
at java.awt.Component.processMouseEvent(Unknown Source)
at javax.swing.JComponent.processMouseEvent(Unknown Source)
at java.awt.Component.processEvent(Unknown Source)
at java.awt.Container.processEvent(Unknown Source)
at java.awt.Component.dispatchEventImpl(Unknown Source)
at java.awt.Container.dispatchEventImpl(Unknown Source)
at java.awt.Component.dispatchEvent(Unknown Source)
at java.awt.LightweightDispatcher.retargetMouseEvent(Unknown Source)
at java.awt.LightweightDispatcher.processMouseEvent(Unknown Source)
at java.awt.LightweightDispatcher.dispatchEvent(Unknown Source)
at java.awt.Container.dispatchEventImpl(Unknown Source)
at java.awt.Window.dispatchEventImpl(Unknown Source)
at java.awt.Component.dispatchEvent(Unknown Source)
at java.awt.EventQueue.dispatchEventImpl(Unknown Source)
at java.awt.EventQueue.access$200(Unknown Source)
at java.awt.EventQueue$3.run(Unknown Source)
at java.awt.EventQueue$3.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at java.security.ProtectionDomain$1.doIntersectionPrivilege(Unknown Source)
at java.security.ProtectionDomain$1.doIntersectionPrivilege(Unknown Source)
at java.awt.EventQueue$4.run(Unknown Source)
at java.awt.EventQueue$4.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at java.security.ProtectionDomain$1.doIntersectionPrivilege(Unknown Source)
at java.awt.EventQueue.dispatchEvent(Unknown Source)
at java.awt.EventDispatchThread.pumpOneEventForFilters(Unknown Source)
at java.awt.EventDispatchThread.pumpEventsForFilter(Unknown Source)
at java.awt.EventDispatchThread.pumpEventsForHierarchy(Unknown Source)
at java.awt.EventDispatchThread.pumpEvents(Unknown Source)
at java.awt.EventDispatchThread.pumpEvents(Unknown Source)
at java.awt.EventDispatchThread.run(Unknown Source)
Caused by: org.pentaho.di.core.exception.KettleDatabaseException:
Error connecting to database: (using class org.apache.hive.jdbc.HiveDriver)
Could not open connection to jdbc:hive2://kor1046740.apac.bosch.com:10000/default: java.net.ConnectException: Connection refused: connect

at org.pentaho.di.core.database.Database.connectUsingClass(Database.java:592)
at org.pentaho.di.core.database.Database.connectUsingClass(Database.java:4697)
at org.pentaho.di.core.database.Database.normalConnect(Database.java:414)
... 118 more
Caused by: java.sql.SQLException: Could not open connection to jdbc:hive2://kor1046740.apac.bosch.com:10000/default: java.net.ConnectException: Connection refused: connect
at org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:187)
at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:164)
at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105)
at java.sql.DriverManager.getConnection(Unknown Source)
at java.sql.DriverManager.getConnection(Unknown Source)
at org.pentaho.di.core.database.Database.connectUsingClass(Database.java:574)
... 120 more
Caused by: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused: connect
at org.apache.thrift.transport.TSocket.open(TSocket.java:185)
at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:248)
at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
at org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:185)
... 125 more
Caused by: java.net.ConnectException: Connection refused: connect
at java.net.DualStackPlainSocketImpl.connect0(Native Method)
at java.net.DualStackPlainSocketImpl.socketConnect(Unknown Source)
at java.net.AbstractPlainSocketImpl.doConnect(Unknown Source)
at java.net.AbstractPlainSocketImpl.connectToAddress(Unknown Source)
at java.net.AbstractPlainSocketImpl.connect(Unknown Source)
at java.net.PlainSocketImpl.connect(Unknown Source)
at java.net.SocksSocketImpl.connect(Unknown Source)
at java.net.Socket.connect(Unknown Source)
at org.apache.thrift.transport.TSocket.open(TSocket.java:180)
... 128 more
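
For what it's worth, I believe the same connection could be tested outside PRD with plain JDBC along these lines (a minimal sketch using the same URL; the user name and password are placeholders, and hive-jdbc plus its dependencies have to be on the classpath):

Code:

import java.sql.Connection;
import java.sql.DriverManager;

public class HiveConnectionTest {
    public static void main(String[] args) throws Exception {
        // Register the Hive JDBC driver explicitly.
        Class.forName("org.apache.hive.jdbc.HiveDriver");

        // Same URL as in the PRD data source; credentials are placeholders.
        String url = "jdbc:hive2://kor1046740.apac.bosch.com:10000/default";
        try (Connection con = DriverManager.getConnection(url, "hive", "")) {
            System.out.println("Connected: " + !con.isClosed());
        }
    }
}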




Can anybody suggest a solution?

Thanks in Advance
Rashmi

Row Normalizer problem

Hi All,
I have to convert columns to rows, so I have used the Row Normalizer step. I have attached the KTR file.
Every field value is going to the Newfield column, but I need the "Letter" field value to go to NewField1,
so the output should be:

Field1  Field2  Field3  result  Newfield  NewField1
1       1       1       issue   y
1       1       1       Letter            a
2       2       2       issue   n
2       2       2       Letter            b


Here, the Letter value should go to NewField1 instead of Newfield. I have attached the KTR file. Please help.


-Ramya-

PROCEDURE doesn't work in CDE Dashboard

Hi,

I am trying to build a CDE dashboard with a table component. The table has to contain the result of a procedure.
I did many tests, and it appears that when I declare a table variable in my procedure I can't execute it from Pentaho (the procedure works in SQL Server).
My procedure looks like this:

ALTER PROCEDURE [dbo].[test]
AS

DECLARE @t TABLE (
a varchar(10)
)
INSERT INTO @t VALUES('aa')
SELECT * FROM @t

Do you have any idea?
Thanks :)

Problems with copying files to Hadoop

Hi,
I've faced with problems when copy csv file from my local machine to Hadoop using Pentaho Kettle.


I have installed the following versions of software:


- Cloudera 5.4 - QuickStart VM with CDH 5.4.x (Virtual Machine installed over Windows 7. I use VMWare Player);
- Hadoop 2.6.0 (/usr/share/cmf/cloudera-navigator-server/libs/cdh5/hadoop-core-2.6.0-mr1-cdh5.4.0.jar);
- Pentaho Kettle (pdi-ce-5.3.0.0-213).


192.168.159.128 - IP address of the virtual machine (Hadoop is installed there)




Pentaho log below:


--------------------------------------------------------------------


2015/06/17 12:53:57 - DBCache - Loading database cache from file: [C:\Users\Admin\.kettle\db.cache-5.3.0.0-213]
2015/06/17 12:53:57 - DBCache - We read 47 cached rows from the database cache!
2015/06/17 12:53:58 - Spoon - Trying to open the last file used.
2015/06/17 12:53:58 - Version checker - OK
2015/06/17 12:54:04 - Spoon - Spoon
2015/06/17 12:54:11 - Spoon - Starting job...
2015/06/17 12:54:14 - hadoop_copy_file - Start of job execution
2015/06/17 12:54:14 - hadoop_copy_file - exec(0, 0, START.0)
2015/06/17 12:54:14 - START - Starting job entry
2015/06/17 12:54:14 - hadoop_copy_file - Starting entry [Hadoop Copy Files]
2015/06/17 12:54:14 - hadoop_copy_file - exec(1, 0, Hadoop Copy Files.0)
2015/06/17 12:54:14 - Hadoop Copy Files - Starting job entry
2015/06/17 12:54:14 - Hadoop Copy Files - Starting ...
2015/06/17 12:54:14 - Hadoop Copy Files - Processing row source File/folder source : [file:///C:/0. Tkachev/0. Projects/6. МТТ/Архитектура/Hadoop/hadoop_input_file.csv] ... destination file/folder : [hdfs://192.168.159.128:8020/user/ktkachev/in]... wildcard : [null]
2015/06/17 12:54:15 - Hadoop Copy Files - file [hdfs://192.168.159.128:8020/user/ktkachev/in\hadoop_input_file.csv] exists!
2015/06/17 12:54:15 - Hadoop Copy Files - File [file:///C:/0. Tkachev/0. Projects/6. МТТ/Архитектура/Hadoop/hadoop_input_file.csv] was overwritten
2015/06/17 12:54:16 - Hadoop Copy Files - ERROR (version 5.3.0.0-213, build 1 from 2015-02-02_12-17-08 by buildguy) : File System Exception: Could not copy "file:///C:/0. Tkachev/0. Projects/6. МТТ/Архитектура/Hadoop/hadoop_input_file.csv" to "hdfs://192.168.159.128:8020/user/ktkachev/in/hadoop_input_file.csv".
2015/06/17 12:54:16 - Hadoop Copy Files - ERROR (version 5.3.0.0-213, build 1 from 2015-02-02_12-17-08 by buildguy) : Caused by: Could not close the output stream for file "hdfs://192.168.159.128:8020/user/ktkachev/in/hadoop_input_file.csv".
2015/06/17 12:54:16 - Hadoop Copy Files - ERROR (version 5.3.0.0-213, build 1 from 2015-02-02_12-17-08 by buildguy) : Caused by: File /user/ktkachev/in/hadoop_input_file.csv could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and 1 node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1541)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3243)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:645)
at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.addBlock(AuthorizationProviderProxyClientProtocol.java:212)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:483)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2040)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2038)
2015/06/17 12:54:16 - hadoop_copy_file - Finished job entry [Hadoop Copy Files] (result=[false])
2015/06/17 12:54:16 - hadoop_copy_file - Job execution finished
2015/06/17 12:54:16 - Spoon - Job has ended.


--------------------------------------------------------------------


Could you help me resolve this problem?
Thanks in advance.


Konstantin.

Kettle loses rows

I have this simple transformation:

from_access_to_mysql.jpg

I would like to convert a MS Access DB to a MySQL one.

In the "table out" step I have set 1000 as "commit size" and I have selected "use batch update for inserts". The Access DB has 500.000 records, but I lose 3% of them during the conversion. Sometimes, I get more than 500.000 rows in the MySQL db.

I've tried changing the commit size and disabling the "Use batch update for inserts" option, but I have the same problem.

Can someone help me?

Thanks in advance.

Oracle Bulk Loader not loading into target table

PDI version 5.3

I'm wondering why I cannot get the Oracle Bulk Loader to load into the target table. It creates the data file and control file, but does not insert into the target table.

I'm running this from my local machine. The data and control files are on my desktop. Do I need to run sqlldr and access the files from the Oracle server, or can this be done locally? If so, why is sqlldr not loading into the table?

sqlldr is pointing to the client/BIN
direct path is selected since I have the files going to my desktop
Oracle client 12.1.0

Also, I get no errors.

Thanks,

Azure Sync with Postgres

Hi,

I am attempting to synchronize a Postgres database table on my local machine to an Azure database table in the cloud.

I am using Merge Rows (diff) on the two tables and then Synchronize after Merge.

I get errors saying the data type of the 'id' field is not the same in the two streams. How do I get past this?

Or, if anyone has other suggestions on how I could accomplish the same job, I'd welcome them.

I will include some of the errors below.






2015/06/17 17:28:37 - Merge Rows (diff).0 - ERROR (version 5.4.0.1-130, build 1 from 2015-06-14_12-34-55 by buildguy) : Unexpected error
2015/06/17 17:28:37 - Merge Rows (diff).0 - ERROR (version 5.4.0.1-130, build 1 from 2015-06-14_12-34-55 by buildguy) : org.pentaho.di.core.exception.KettleException:
2015/06/17 17:28:37 - Merge Rows (diff).0 - Invalid layout detected in input streams, keys and values to merge have to be of identical structure and be in the same place in the rows
2015/06/17 17:28:37 - Merge Rows (diff).0 -
2015/06/17 17:28:37 - Merge Rows (diff).0 - The data type of field #1 is not the same as the first row received: you're mixing rows with different layout. Field [id BigNumber(19)] does not have the same data type as field [id Integer(15)].
2015/06/17 17:28:37 - Merge Rows (diff).0 -
2015/06/17 17:28:37 - Merge Rows (diff).0 -
2015/06/17 17:28:37 - Merge Rows (diff).0 - at org.pentaho.di.trans.steps.mergerows.MergeRows.processRow(MergeRows.java:86)
2015/06/17 17:28:37 - Merge Rows (diff).0 - at org.pentaho.di.trans.step.RunThread.run(RunThread.java:62)
2015/06/17 17:28:37 - Merge Rows (diff).0 - at java.lang.Thread.run(Thread.java:745)
2015/06/17 17:28:37 - Merge Rows (diff).0 - Caused by: org.pentaho.di.core.exception.KettleRowException:
2015/06/17 17:28:37 - Merge Rows (diff).0 - The data type of field #1 is not the same as the first row received: you're mixing rows with different layout. Field [id BigNumber(19)] does not have the same data type as field [id Integer(15)].
2015/06/17 17:28:37 - Merge Rows (diff).0 -
2015/06/17 17:28:37 - Merge Rows (diff).0 - at org.pentaho.di.trans.step.BaseStep.safeModeChecking(BaseStep.java:2061)
2015/06/17 17:28:37 - Merge Rows (diff).0 - at org.pentaho.di.trans.steps.mergerows.MergeRows.checkInputLayoutValid(MergeRows.java:245)
2015/06/17 17:28:37 - Merge Rows (diff).0 - at org.pentaho.di.trans.steps.mergerows.MergeRows.processRow(MergeRows.java:84)
2015/06/17 17:28:37 - Merge Rows (diff).0 - ... 2 more
2015/06/17 17:28:37 - Azure.0 - Finished processing (I=3, O=0, R=0, W=3, U=0, E=0)
2015/06/17 17:28:37 - Merge Rows (diff).0 - Finished processing (I=0, O=0, R=2, W=0, U=0, E=1)
2015/06/17 17:28:37 - address - ERROR (version 5.4.0.1-130, build 1 from 2015-06-14_12-34-55 by buildguy) : Errors detected!
2015/06/17 17:28:37 - Spoon - The transformation has finished!!
2015/06/17 17:28:37 - address - ERROR (version 5.4.0.1-130, build 1 from 2015-06-14_12-34-55 by buildguy) : Errors detected!
2015/06/17 17:28:37 - address - ERROR (version 5.4.0.1-130, build 1 from 2015-06-14_12-34-55 by buildguy) : Errors detected!




Thank-you,

Cameron