Channel: Pentaho Community Forums
Viewing all 16689 articles

how to add column with constant number to each file

Hello,
I'd appreciate your help.
I have a step that reads over 100 text files;
each file contains about 15 lines.
I'd like to add a column with a constant number that identifies each file,
and load all of this into a table.

example

file 1
column 1 column 2
xxxxxx ---- 1
xxxxxx ---- 1
xxxxxx ---- 1
xxxxxx ---- 1

file 2
xxxxxx ---- 2
xxxxxx ---- 2
xxxxxx ---- 2

file 3
xxxxxx ---- 3
xxxxxx ---- 3
xxxxxx ---- 3

Using the row number by file does not work, because that number is sequential rather than a single constant value for each file.

I'm sorry about my English.
Thank you very much.

Text file input ------> Select values ----------> Table output
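
The numbering asked for above can be sketched in plain JavaScript. This is only an illustration of the logic, not a PDI configuration: it assumes each row already carries the name of the file it came from (the Text file input step has an option to include the filename in the output), and the field names `filename`, `value`, and `fileNumber` are made up for the example.

```javascript
// Assign one constant number to every row of the same file,
// numbering files in the order in which they are first seen.
function addFileNumber(rows) {
  const fileNumbers = new Map(); // filename -> file number
  return rows.map((row) => {
    if (!fileNumbers.has(row.filename)) {
      fileNumbers.set(row.filename, fileNumbers.size + 1);
    }
    return { ...row, fileNumber: fileNumbers.get(row.filename) };
  });
}

// Hypothetical rows, as they might arrive from the input step.
const sampleRows = [
  { filename: "file1.txt", value: "xxxxxx" },
  { filename: "file1.txt", value: "xxxxxx" },
  { filename: "file2.txt", value: "xxxxxx" },
  { filename: "file3.txt", value: "xxxxxx" },
  { filename: "file3.txt", value: "xxxxxx" },
];

for (const row of addFileNumber(sampleRows)) {
  console.log(row.value + " ---- " + row.fileNumber);
}
```

Keying the number on the filename rather than on a row counter is what keeps it constant within a file.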

xml file Performance issue

Hi All,


I have to generate many XML files.
In fact, I need to generate about a million XML files, one for each member.
I am looping over member IDs in a job to generate the files.
This caused the heap size error "java.lang.OutOfMemoryError: Java heap space".
So I increased the setting in spoon.sh OPT =-Xmx1024m, but that did not work.
So I tried running it in a Job Executor step.
Now the job is working fine,
but it is taking one hour to generate 1000 XML files.


Is there any way to improve the performance?
Each file takes about 2-3 seconds to generate.
My transformation has table input, add XML, join XML, replace in string, and text file output steps.
Any help would be great!


Thanks,
Vineetha P
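
To put the figures from this post in perspective, here is a small back-of-the-envelope calculation. It uses only the numbers stated above (1000 files per hour, about a million files in total) and assumes the throughput stays constant, which may not hold in practice.

```javascript
// Figures taken from the post above.
const filesPerHour = 1000;
const totalFiles = 1000000;

// Average time spent on one file at the observed throughput.
function secondsPerFile(rate) {
  return 3600 / rate;
}

// Total run time in hours if the throughput stays constant.
function totalHours(total, rate) {
  return total / rate;
}

const hours = totalHours(totalFiles, filesPerHour);
console.log("seconds per file: " + secondsPerFile(filesPerHour));
console.log("total hours: " + hours);
console.log("total days: " + (hours / 24).toFixed(1));
```

At the observed rate the full run works out to roughly 1000 hours, or about 42 days.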

Modified Java Script Value - fireToDB - Database connection not found

Hi

I have used this in a previous version (4 something). However, my company updated to 6.0.1.0-386, and now the following script fails:

Code:

var strConn = "testOra";
var strInSQL = 'select count(1) from arschema';
var InxArr = fireToDB(strConn, strInSQL);
Alert(InxArr[0][0]);

When I click Test Script or Run it, I get the following:

Code:

2018/12/14 14:44:26 - Modified Java Script Value.0 - ERROR (version 6.0.1.0-386, build 1 from 2017-11-24 23.33.08 by buildadm) : Unexpected error
2018/12/14 14:44:26 - Modified Java Script Value.0 - ERROR (version 6.0.1.0-386, build 1 from 2017-11-24 23.33.08 by buildadm) : org.pentaho.di.core.exception.KettleValueException:
2018/12/14 14:44:26 - Modified Java Script Value.0 - Javascript error:
2018/12/14 14:44:26 - Modified Java Script Value.0 - org.mozilla.javascript.EvaluatorException: Database connection not found: testOra (script#5) (script#5)
2018/12/14 14:44:26 - Modified Java Script Value.0 -
2018/12/14 14:44:26 - Modified Java Script Value.0 -    at org.pentaho.di.trans.steps.scriptvalues_mod.ScriptValuesMod.addValues(ScriptValuesMod.java:474)
2018/12/14 14:44:26 - Modified Java Script Value.0 -    at org.pentaho.di.trans.steps.scriptvalues_mod.ScriptValuesMod.processRow(ScriptValuesMod.java:540)
2018/12/14 14:44:26 - Modified Java Script Value.0 -    at org.pentaho.di.trans.step.RunThread.run(RunThread.java:62)
2018/12/14 14:44:26 - Modified Java Script Value.0 -    at java.lang.Thread.run(Unknown Source)
2018/12/14 14:44:26 - Modified Java Script Value.0 - Caused by: org.mozilla.javascript.EvaluatorException: org.mozilla.javascript.EvaluatorException: Database connection not found: testOra (script#5) (script#5)
2018/12/14 14:44:26 - Modified Java Script Value.0 -    at org.mozilla.javascript.DefaultErrorReporter.runtimeError(DefaultErrorReporter.java:109)
2018/12/14 14:44:26 - Modified Java Script Value.0 -    at org.mozilla.javascript.Context.reportRuntimeError(Context.java:945)
2018/12/14 14:44:26 - Modified Java Script Value.0 -    at org.mozilla.javascript.Context.reportRuntimeError(Context.java:1001)
2018/12/14 14:44:26 - Modified Java Script Value.0 -    at org.pentaho.di.trans.steps.scriptvalues_mod.ScriptValuesAddedFunctions.fireToDB(ScriptValuesAddedFunctions.java:540)
2018/12/14 14:44:26 - Modified Java Script Value.0 -    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
2018/12/14 14:44:26 - Modified Java Script Value.0 -    at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
2018/12/14 14:44:26 - Modified Java Script Value.0 -    at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
2018/12/14 14:44:26 - Modified Java Script Value.0 -    at java.lang.reflect.Method.invoke(Unknown Source)
2018/12/14 14:44:26 - Modified Java Script Value.0 -    at org.mozilla.javascript.MemberBox.invoke(MemberBox.java:161)
2018/12/14 14:44:26 - Modified Java Script Value.0 -    at org.mozilla.javascript.FunctionObject.call(FunctionObject.java:413)
2018/12/14 14:44:26 - Modified Java Script Value.0 -    at org.mozilla.javascript.optimizer.OptRuntime.callName(OptRuntime.java:97)
2018/12/14 14:44:26 - Modified Java Script Value.0 -    at org.mozilla.javascript.gen.script_34._c_script_0(script:5)
2018/12/14 14:44:26 - Modified Java Script Value.0 -    at org.mozilla.javascript.gen.script_34.call(script)
2018/12/14 14:44:26 - Modified Java Script Value.0 -    at org.mozilla.javascript.ContextFactory.doTopCall(ContextFactory.java:426)
2018/12/14 14:44:26 - Modified Java Script Value.0 -    at org.mozilla.javascript.ScriptRuntime.doTopCall(ScriptRuntime.java:3178)
2018/12/14 14:44:26 - Modified Java Script Value.0 -    at org.mozilla.javascript.gen.script_34.call(script)
2018/12/14 14:44:26 - Modified Java Script Value.0 -    at org.mozilla.javascript.gen.script_34.exec(script)
2018/12/14 14:44:26 - Modified Java Script Value.0 -    at org.pentaho.di.trans.steps.scriptvalues_mod.ScriptValuesMod.addValues(ScriptValuesMod.java:387)
2018/12/14 14:44:26 - Modified Java Script Value.0 -    ... 3 more
2018/12/14 14:44:26 - Modified Java Script Value - PREVIEW - Transformation detected one or more steps with errors.
2018/12/14 14:44:26 - Modified Java Script Value - PREVIEW - Transformation is killing the other steps!

The testOra connection is working fine in a table input. I have created a second test connection and the same error appears.

I've searched the Jira and couldn't find anything. Any ideas please?

Thanks
Danny

How to install a complete 8.2 suite?

I'm trying to install the 8.2 BI Suite CE on a SQL Server DBMS, but I don't know exactly what I have to download.
I need to build a DW to store historical and analytical data, so I'm going to download PDI https://sourceforge.net/projects/pen...2.zip/download.
Then I have to build some cubes, so I think I need Mondrian, but I can't find it on SourceForge. Could it be that it's integrated into Pentaho Server CE? https://sourceforge.net/projects/pen...%208.2/server/
Finally, I need to build some reports, and I think the piece I need is PRD https://sourceforge.net/projects/pen...2.zip/download

Am I correct, or am I forgetting something? Do you think I have everything needed to present reports based on data extracted and organized into cubes?

Thank you very much

load binary data to table

I'm trying to load 70,000 picture files into a database table in PostgreSQL. I found this page https://wiki.pentaho.com/display/COM...ts+With+Kettle
but I get an error in the JavaScript step:

Cannot call property createByteArray in object [JavaPackage be.ibridge.kettle.core.Const]. It is not a function, it is "object". (script#4)

Since I am not a Java coder, I have no idea how to solve this. Does anyone know what I need to change?

Pentaho v8 with CAS 8 - Help - do you have it installed?

We have just upgraded to Pentaho v8 and want to install it with CAS v5. The Pentaho 8 with CAS install documentation (link below) recommends CAS 3.1.10. I have an open ticket with Pentaho questioning this, and I have been told that they have only tested CAS 3.1.10, although the support contact recommended trying CAS 3.5.2, which is still an old version.

Has anyone successfully installed CAS 5 with Pentaho v8? We also use Tomcat 8 and Java 8.

https://help.pentaho.com/Documentati...curity/060/000
Step 2 in the document says to download old versions of CAS:
cas-client-core-3.1.10.jar

Calculation when the job runs in parallel threads

Hi

I have a transformation T1 which executes a job J1 using the Job Executor step.

One of the transformation steps passes a result set (a list of filenames) to the Job Executor, and the job downloads them.

The Job Executor's number of copies is set to 10, so that multiple threads download in parallel.

The functionality works well, but I need logic to count the total number of files downloaded by all the threads. What is the best way to achieve this? In some cases, some threads may not download any files (0 rows passed from the input transformation).

Changes in license for PDI community edition after being bought by Hitachi?

Hi, I don't know if anyone has already asked this question. I am just wondering whether there have been any changes to the Pentaho DI community edition license since it was bought by Hitachi. The reason I am asking is that we use PDI community edition for some of our processes in the company, and I would like to make sure we are not violating any licenses :).

Executing bat file on remote machine from local machine in same network.

Hi All,

I have multiple machines on the same network. My Pentaho solution is installed on one machine, say machine A.
I have created an executeJob.bat file on machine A, and I run the Pentaho ETL job by executing executeJob.bat on machine A.
Now I want to execute this executeJob.bat file on machine A from a remote machine, say machine B or machine C, which is on the same network.
Please help me understand how I can achieve this.


Thanks
Ajinkya

PRD Dynamic Environment Variables

Hi members! I am working with Pentaho Server 6.1 and I need to integrate a report (made in PRD 5.4). The report works as I expect, but I must solve one small problem before putting it into the production environment. The report uses a CDA data set, and I need to specify a dynamic "server URL" (depending on the customer's environment). I have tried using "Server URL field", but the environment variables have default values. The question is: how can I set the environment variables dynamically? Thank you for reading!

Slow Load to Database

I have the same problem. A daily load of one million records takes an average of 5 hours to run. Any solution?

How to find corresponding line in file?

I use the "Get File Names => Mail" steps to send attachments from a directory. Between them I need a step that looks up the correct address for each attachment.

For instance, short_filename is 1098235.txt, so I need to find the line in a text file that contains '1098235 some@address.com' (beforehand I will cut '1098235' out of the short_filename field using JavaScript) and use the e-mail address in the Mail step.

So the file that maps numbers to e-mail addresses could have this content:

company_number; email_address
1098234; x@address.com
1098235; some@address.com
1098236; z@address.com
..; ...

Any advice?
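
The lookup described above can be sketched in plain JavaScript. This is a minimal illustration under assumptions, not a PDI step configuration: the mapping content is copied from the example in the post, and the function names `parseMapping`, `companyNumber`, and `emailFor` are hypothetical.

```javascript
// Mapping file content in the format shown in the post:
// a header line, then "company_number; email_address" pairs.
const mappingText = [
  "company_number; email_address",
  "1098234; x@address.com",
  "1098235; some@address.com",
  "1098236; z@address.com",
].join("\n");

// Build a lookup table from company number to e-mail address.
function parseMapping(text) {
  const lookup = new Map();
  const lines = text.split(/\r?\n/).slice(1); // skip the header line
  for (const line of lines) {
    const parts = line.split(";");
    if (parts.length < 2) continue; // ignore malformed lines
    lookup.set(parts[0].trim(), parts[1].trim());
  }
  return lookup;
}

// "1098235.txt" -> "1098235" (strip the file extension).
function companyNumber(shortFilename) {
  return shortFilename.replace(/\.[^.]*$/, "");
}

// Return the address for an attachment, or null when none is mapped.
function emailFor(shortFilename, lookup) {
  const email = lookup.get(companyNumber(shortFilename));
  return email === undefined ? null : email;
}

const addressLookup = parseMapping(mappingText);
console.log(emailFor("1098235.txt", addressLookup));
```

Reading the mapping once into a table and looking each filename up in it avoids rescanning the text file for every attachment.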

Directory name variable in Reporting Designer

I use PDI, and in a transformation I have a Pentaho Reporting Output step.
In the report I use a PDI transformation as the data source. The path to this transformation is a static path, for instance "c:\pdi\workspace\some_job\transformation1.ktr"

The problem is that when I move the transformation to another directory, I have to redefine the path. How can I make the directory a dynamic variable, something like ${Internal.Entry.Current.Directory} in PDI?

Regards

Directory variable in PRD?

Forgive me for posting this on the Kettle forum, but the PRD one is dead; I never got any answer there. Maybe someone can tell me how to solve this:

I use PDI, and in a transformation I have a Pentaho Reporting Output step.
In the report I use a PDI transformation as the data source. The path to this transformation is a static path, for instance "c:\pdi\workspace\some_job\transformation1.ktr"

The problem is that when I move the transformation to another directory, I have to redefine the path. How can I make the directory a dynamic variable, something like ${Internal.Entry.Current.Directory} in PDI?

Regards

LDAP Code: 49. Error authenticating user 80090308: LdapErr: DSID-0C09042F

Hello everyone,


It's been several days and I still cannot solve this problem. I want to connect Pentaho Data Integration 8.1 to Active Directory, but I cannot connect.
Can you help me, please?

Custom Connection URL : jdbc:activedirectory:User=CN=xxxx,OU=Domain Controllers,DC=xxx,DC=local;Server=192.xx.xx.xx;Port=389;
Custom Driver Class Name : cdata.jdbc.activedirectory.ActiveDirectoryDriver
User Name : administrator
Password : XXXXX

Here is the error it gives me:

java.lang.reflect.InvocationTargetException: Couldn't find any rows because of an error :org.pentaho.di.core.exception.KettleDatabaseException:
An error occurred executing SQL:
SELECT * FROM Computer
LDAP Code: 49. Error authenticating user 80090308: LdapErr: DSID-0C09042F, comment: AcceptSecurityContext error, data 52e, v2580
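
As a reading aid for the error line above, here is a small sketch that extracts the "data" sub-code from such a message and looks it up in a table of commonly documented Active Directory bind sub-codes. The table is a partial list included for illustration, and the function name `explainBindError` is hypothetical. The lookup only interprets the message; it does not diagnose this particular setup.

```javascript
// Commonly documented Active Directory sub-codes that accompany
// LDAP result code 49 (partial list, for illustration only).
const AD_BIND_SUBCODES = {
  "525": "user not found",
  "52e": "invalid credentials",
  "530": "not permitted to log on at this time",
  "531": "not permitted to log on at this workstation",
  "532": "password expired",
  "533": "account disabled",
  "701": "account expired",
  "773": "user must reset password",
  "775": "account locked out",
};

// Pull the hexadecimal value after "data" out of the error text
// and translate it; returns null if no sub-code is present.
function explainBindError(message) {
  const match = /data\s+([0-9a-f]+)/i.exec(message);
  if (!match) return null;
  const code = match[1].toLowerCase();
  return { code: code, meaning: AD_BIND_SUBCODES[code] || "unknown sub-code" };
}

// The error line quoted in the post.
const errorLine =
  "LDAP Code: 49. Error authenticating user 80090308: LdapErr: DSID-0C09042F, " +
  "comment: AcceptSecurityContext error, data 52e, v2580";

console.log(explainBindError(errorLine));
```

For the message in this post the sub-code is 52e, which is generally documented as invalid credentials, so the bind user name or password is a reasonable first thing to check.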

StAX Files processing - GC overhead limit exceeded

Hi everybody,
I've searched through several posts but couldn't find a solution to this problem.

I've got a job (Job1) that reads from disk the names of all the XML files in a specific input folder.
These file names are then passed to another job (Job2) that:
  1. Reads the file (StAX)
  2. Saves the content to a database, if the content of the XML satisfies specific requirements
  3. Deletes the imported file from disk


Job2 executes for every input row, so it runs once for every input file.

The problem is: after several files have been processed, I receive the out-of-memory error "GC overhead limit exceeded".

Given that I would like to be able to process thousands of files, how could I structure my jobs to avoid this memory error?
I thought that the job option "Execute for every input row" released the memory after every single processing run, but that does not seem to be the case.

I'm sorry, but I can't post the job and transformation files for privacy reasons.

P.S.
Increasing the Spoon memory limit is not a solution for me. :D

Any ideas?

Thanks in advance!!!

HTTP post - proxy

For some reason Kettle is not able to use the proxy host; the response is: failed to respond
The same proxy works in the REST client step.

Can "HTTP post" work with a proxy?

Step "Wait for file" with different File name

Hi!

I get a file every day to use in my PDI solution; however, it arrives with a date-stamped name, "CompanyName_LoadBI_YYYYmmDD.HHMM.zip".
Because the date and time vary, I cannot put a specific name in "Wait for file" -> "File name".
Note that there are never multiple files: it is always a single file, once a day, almost always at the same time, but with a different date/time in its name.


I tried to do the following in the "File name" field:
C:\Users\Company\upload\.*CompanyName_LoadBI.*\.zip$


But to no avail.
Can you help me?


Thanks! :)
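
The naming scheme above can be captured with a regular expression once the folder is kept separate from the file-name pattern. The sketch below only shows the pattern itself in plain JavaScript. Whether the "File name" field of "Wait for file" accepts a regular expression is not established in this post, so this may need a step that takes a folder plus a regular-expression wildcard (Get File Names, for example). The sample file names are hypothetical.

```javascript
// Folder and file-name pattern are kept separate, so the backslashes in
// the Windows path are never interpreted as regex escapes.
const uploadFolder = "C:\\Users\\Company\\upload";

// CompanyName_LoadBI_YYYYmmDD.HHMM.zip: 8 digits, a dot, 4 digits, ".zip".
const dailyFilePattern = /^CompanyName_LoadBI_\d{8}\.\d{4}\.zip$/;

// Keep only the names that follow the daily naming scheme.
function findDailyFiles(fileNames) {
  return fileNames.filter((name) => dailyFilePattern.test(name));
}

// Hypothetical directory listing.
const listing = [
  "CompanyName_LoadBI_20181214.0630.zip",
  "CompanyName_LoadBI_20181214.0630.zip.part",
  "notes.txt",
];

console.log(uploadFolder, findDailyFiles(listing));
```

Anchoring the pattern with `^` and `$` keeps partially uploaded or renamed files from matching.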

Clarification on Pan and Kitchen

Hi Team,

I am able to run a transformation (.ktr) using AEL (with the Spark engine) in the PDI client.

So my question is:

How do I run a transformation using pan.sh in AEL (Spark engine)?

Remote connection

Please, how do I enable a remote connection (Spoon)? Thanks