Channel: Pentaho Community Forums
Viewing all 16689 articles

Using "Replace in String"

Hello Everybody,

I am new to Pentaho Kettle CE 4.4 (stable).
My requirement is to read Cust_Name from an Excel input and replace the following characters/strings in its values:

1. Any kind of extra space(s)
2. " AND " with leading and trailing space(s)
3. " & " with leading and trailing space(s)
4. "." Dots on their own, since any surrounding spaces are already handled by point 1
5. "-" Dashes
6. "M/s"

I used the Replace in String step, but the output isn't as expected.
Spoon even deletes the letters before and after the dots, dashes, etc. mentioned above.
In a few cases it doesn't remove the dots (.) at all.

Is there a precedence or evaluation order for these replacements?

Would appreciate any kind of answer!
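A note on ordering, in case it helps: as far as I know the step applies its replacement rows top to bottom, so doing the specific patterns first and collapsing whitespace last avoids surprises. Also, if "use RegEx" is set to Y, an unescaped "." matches any character, which would explain the letters next to the dots being deleted. An illustrative JavaScript sketch of one ordering that matches the list above (cleanName is a made-up name, not Kettle step code):

```javascript
// Illustrative ordering only -- not Kettle step code.
// In the Replace in String step with "use RegEx" = Y, an unescaped "."
// matches ANY character, so it must be written as "\." there as well.
function cleanName(name) {
  return name
    .replace(/\s+AND\s+/g, " ") // 2. " AND " with surrounding spaces
    .replace(/\s+&\s+/g, " ")   // 3. " & " with surrounding spaces
    .replace(/M\/s/g, "")       // 6. "M/s"
    .replace(/\./g, " ")        // 4. bare dots (escaped in the regex)
    .replace(/-/g, " ")         // 5. dashes
    .replace(/\s+/g, " ")       // 1. collapse extra spaces last
    .trim();
}

cleanName("M/s Foo AND Bar-Baz.Ltd"); // "Foo Bar Baz Ltd"
```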

PDI on AWS

About a month ago I thought I had found pricing for Pentaho PDI on AWS but now I cannot find that information.

The initial answer I get from Pentaho sales is that it "can be used on AWS". I know it can be installed and run on EC2, but we use a lot of AWS and our application uses Redshift, so it would be preferable if we could just fire up an EC2 instance and be billed with AWS-style pricing.

So if someone is doing this, I would love to hear about your experience. On the flip side, if people are just running PDI on EC2, what is your experience, and are you using CE or EE?

How to delete a 'Group'

I had created a sub-group inside my main 'Group' section that performed aggregates on a field. I now wish to delete the sub-group. How can I do this?

- Going to Edit > Delete is grayed out.
- Hitting delete on the keyboard when selecting the sub-group does not work.
- 'Cutting' the group leaves a 'null' sub-group.

Pentaho embedded in an external application: authentication

Hello, I need to embed some BA Server 5.4 reports in an external application without a login. The "Embed BA into Web Applications" Pentaho help documentation describes different ways to integrate Analysis, Interactive Reports, and Dashboards with an external app, but it does not cover the authentication modes for each. Searching the Pentaho Infocenter, I found a post (https://help.pentaho.com/Documentation/5.3/0P0/000/090) on "Configuring the BA Server to Accept Authentication Credentials in a URL". I followed it, and it really does work for executing Interactive Reports and Dashboards with the user and password in a URL, but when I execute an Analysis Report it asks me for the user and password the first time (if I clear the browser cache it asks for credentials again). I need to know how to solve that problem, and whether it is possible to provide authentication via POST. Thank you for your help, Yunier.

Database Join Step Not Working At Load.

Hi,
Here is the basic setup of what I am trying to do. I pick up a list of people who may or may not have to pay taxes of one sort or another based on the products that they have.
I have a Database Join step that effectively looks up whether the customer is taxable based on the zip and city of the customer's address.
The query is:

SELECT COUNT(*) AS Taxable
FROM TaxTable
WHERE zip = ?
  AND city = ?
  AND isTaxable = 1

where I pass the corresponding values from the customer's record to the ? parameters.

The problem is that when I look at the output, customers that should have been taxed are excluded. I filter the output of this step and record any customers that are not taxed, so I can see that the Taxable field I added to the flow record is set to 0.

The strange thing is that if I modify my input query to pick up a customer that I know will not be taxed, it has Taxable = 0 and works correctly. On the other hand, if I use a customer that IS taxable, it comes up as Taxable = 1 and the customer is put in a file to be taxed.

But when I use the same taxable customer at load (using 600K customers as input), the customer that SHOULD be taxed ends up in the non-taxed exclusion file.

What is going on?
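One thing worth ruling out at volume: trailing spaces or case differences in the zip/city values that only occur on some source records would produce exactly this "works in isolation, fails at load" symptom, because the parameterized equality match in the query is exact. Normalizing both sides before the Database Join (e.g. with a String operations step doing trim + upper case) is a cheap test. The idea, sketched in plain JavaScript (normalizeKey is a made-up name):

```javascript
// What a trim + upper-case normalization does to lookup keys -- mirrors a
// String operations step applied to zip/city before the Database Join.
function normalizeKey(value) {
  return value == null ? "" : String(value).trim().toUpperCase();
}

// These two fail an exact SQL "city = ?" match but agree after normalizing:
normalizeKey("Austin ") === normalizeKey("AUSTIN"); // true
```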

Removing the comma (,) from the end of the second-to-last row?

I am trying to create a Text file output step whose output is formatted very similarly to JSON. I have it working, with just one issue: the second-to-last row needs the final comma (,) removed. Here is an example:

After my Transformation, the following is a sample file created:

{"nodes":[
{"id": "tsom.256d1276-41d6-11e5-a5ef-b72ef4e0a925", "label": "dcesx2", "text": "dcesx2", "image": "img/models/ESXi.png", "shape": "circularImage", "color": { "background":"blue", "border":"blue", "highlight":{ "background":"blue","border":"blue"}}},
{"id": "tsom.256d1278-41d6-11e5-a5ef-b72ef4e0a925", "label": "dcesx4", "text": "dcesx4", "image": "img/models/ESXi.png", "shape": "circularImage", "color": { "background":"green", "border":"green", "highlight":{ "background":"green","border":"green"}}},
...
{"id": "tsom.4928b787-41d6-11e5-a5ef-b72ef4e0a925", "label": "NIT", "text": "NIT", "image": "img/models/BMC_BusinessService.png", "shape": "circularImage", "color": { "background":"green", "border":"green", "highlight":{ "background":"green","border":"green"}}},
]}

I need the second-to-last row NOT to have the comma (,) at the end. So the end result is:

{"nodes":[
{"id": "tsom.256d1276-41d6-11e5-a5ef-b72ef4e0a925", "label": "dcesx2", "text": "dcesx2", "image": "img/models/ESXi.png", "shape": "circularImage", "color": { "background":"blue", "border":"blue", "highlight":{ "background":"blue","border":"blue"}}},
{"id": "tsom.256d1278-41d6-11e5-a5ef-b72ef4e0a925", "label": "dcesx4", "text": "dcesx4", "image": "img/models/ESXi.png", "shape": "circularImage", "color": { "background":"green", "border":"green", "highlight":{ "background":"green","border":"green"}}},
...
{"id": "tsom.4928b787-41d6-11e5-a5ef-b72ef4e0a925", "label": "NIT", "text": "NIT", "image": "img/models/BMC_BusinessService.png", "shape": "circularImage", "color": { "background":"green", "border":"green", "highlight":{ "background":"green","border":"green"}}}
]}



Please advise, and thank you for the support!

KP
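For reference, there are two common ways to handle this kind of trailing separator: build the rows first and emit the comma between rows (a join) rather than after each row, or post-process the finished text and drop the comma that precedes the closing "]}". Both are sketched below in plain JavaScript (buildNodes and stripLastComma are made-up names, illustrating the logic rather than a specific Kettle step):

```javascript
// Approach 1: join rows with the separator instead of appending it to each row.
function buildNodes(rows) {
  return '{"nodes":[\n' + rows.join(",\n") + "\n]}";
}

// Approach 2: post-process an already-built text and drop the comma that
// immediately precedes the closing "]}".
function stripLastComma(text) {
  return text.replace(/,(\s*\]\})/, "$1");
}
```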

Error while running BI server

Hi,

I tried running the BI server for the first time but am unable to do so.

After all the configuration, I entered the URL, and below is a screenshot of the error I got. I am also attaching the pentaho.log file. Please tell me what has to be done in order to run the server.
Attached Images
Attached Files

Modifying Column header before exporting to Excel

Hello,

I am using exportData to generate an Excel file from an MDX query, but the column headers come out like "xxxxx.xxxxx". I want to replace each such header with "xxxxx". In my code I cannot assign the right header names before exporting.



var query = new Query("xxx.cda", "yyy");

query.fetchData(params, function(data){
    if (data.resultset !== undefined){
        var rs = data.resultset;
        var header = [];
        /* some code to assign the header array */
    }
});

alert(header); // header array has proper values

query.exportData("xls", null, {
    filename: "text.xls",
    columnHeaders: header
}); // Not working
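Two things stand out in the snippet as posted, assuming CDF's usual behavior: header is declared with var inside the fetchData callback, so it is not visible to the exportData call outside it, and fetchData is asynchronous, so exportData runs before the callback has filled the array anyway. A sketch of the reordering, with a made-up Query stub standing in for CDF so the control flow can be run end to end (whether exportData honors a columnHeaders option depends on the CDF version, as in the original post):

```javascript
// Made-up stub of CDF's Query, only to demonstrate ordering and scope.
function Query(cda, dataAccessId) {
  this.fetchData = function (params, cb) { cb({ resultset: [["xxxxx.yyyyy"]] }); };
  this.exportData = function (type, overrides, opts) { return opts.columnHeaders; };
}

var query = new Query("xxx.cda", "yyy");
var exportedHeaders;

query.fetchData({}, function (data) {
  if (data.resultset !== undefined) {
    var header = [];
    // e.g. keep only the part after the last dot of each "xxxxx.yyyyy" name
    for (var i = 0; i < data.resultset[0].length; i++) {
      header.push(String(data.resultset[0][i]).split(".").pop());
    }
    // export only here, once header is guaranteed to be filled
    exportedHeaders = query.exportData("xls", null, {
      filename: "text.xls",
      columnHeaders: header // assumes this option is honored, as in the post
    });
  }
});
```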

Is PDI suitable to be executed through Java?

Hi,

I wanted to know if PDI (Kettle) is a suitable tool for this use case:

1) Create a transformation or job through Spoon.
2) Execute the .kjb or .ktr through Java.

Please provide your valuable answers.

Backslash issue with kettle.properties

Hi All,


I am getting issues when using backslashes in kettle.properties.


I have to get files from the path \\Cname\share, which is a shared location.
I am using the Get File Names step.
I am getting the error:
Get File Names.0 - WARNING: Not accessible file:////Cname/share


When I take files from my own D: drive, the path is OK and the transformation works fine; I am able to take files from there (D://foldername).


My kettle.properties entry is:
Path = \\Cname\\share


Could anybody give a hint ?


Thank You.
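For what it's worth: kettle.properties is read as a standard Java properties file, and in that format the backslash is an escape character, so every literal backslash has to be doubled. The entry above (Path = \\Cname\\share) therefore loads as \Cname\share, with only a single leading backslash. A sketch of the corrected entry, assuming the UNC path from the post:

```
# kettle.properties -- Java properties escaping: "\\" in the file = one "\"
# To get the literal UNC path \\Cname\share, double every backslash:
Path = \\\\Cname\\share
```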

MongoDB Output: add a $currentDate field

MongoDB can add a field set to the current date, as described at http://docs.mongodb.org/manual/refer...e/currentDate/
An example update query would be:

db.users.update(
    { _id: 1 },
    {
        $currentDate: {
            my_last_update_date: true
        },
        $set: {
            // ...some other fields
        }
    }
)

Is it possible to use the $currentDate modifier in the MongoDB Output step?

Table Input read from an AS/400 table does not run in version 5.4

In Spoon version 5.4 (I am using Linux) the query does not execute; the same version on Windows does execute the query. I don't know whether something additional is needed on Linux, because when I run an older Spoon version such as 4.0 it does work. Please help me. The screen stays like this:

Imagen1.jpg

Regards,
Attached Images

Using Data Validation Step for Date Validation

Hello,

I have created a sample file and a basic transformation to test several validations using the Data Validator Step. I am running into issues when I attempt to validate the date.
The expected date format is yyyy/MM/dd. In the test input file, I have created a date in the incorrect format MM/dd/yyyy. I expect the Data Validator step to catch this error and write it out to the output file, but instead I am receiving the following error:
- Data Validator.0 - ERROR (version 5.4.0.1-130, build 1 from 2015-06-14_12-34-55 by buildguy) : Unexpected error
2015/09/09 09:52:41 - Data Validator.0 - ERROR (version 5.4.0.1-130, build 1 from 2015-06-14_12-34-55 by buildguy) : org.pentaho.di.core.exception.KettleValueException:
2015/09/09 09:52:41 - Data Validator.0 - Transaction Date String : couldn't convert string [01/01/2015] to a date using format [yyyy/MM/dd] on offset location 10
2015/09/09 09:52:41 - Data Validator.0 - Unparseable date: "01/01/2015".

I have attached my sample file as well as my transformation.
I have also attempted to set the field in the incoming step as a String and then use a Select Values step to change it to a Date, but this also does not work.

Any help would be greatly appreciated!
Attached Files
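As a possible workaround while the error itself is investigated: keep the field a String and gate it on format first, since the crash happens during the string-to-date conversion. The Data Validator has, if I recall correctly, a "regular expression expected to match" check that can route non-matching rows to error handling before any date parsing is attempted. The yyyy/MM/dd shape can be checked as sketched below in JavaScript (matchesYyyyMmDd is a made-up name; this checks format only, not calendar validity):

```javascript
// Format gate only: four digits, slash, two digits, slash, two digits.
// "01/01/2015" fails because the four-digit year is not at the front.
function matchesYyyyMmDd(value) {
  return /^\d{4}\/\d{2}\/\d{2}$/.test(value);
}

matchesYyyyMmDd("2015/09/09"); // true
matchesYyyyMmDd("01/01/2015"); // false
```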

Excel Input step. "Current state not START_ELEMENT" exception

Hello,

When I tried to load a large Excel file with the Microsoft Excel Input step, I got the exception "Current state not START_ELEMENT".
Detailed exception info:

Code:

ERROR (version 5.4.0.1-130, build 1 from 2015-06-14_12-34-55 by buildguy) : Error processing row from Excel file [...] : java.lang.RuntimeException: java.lang.IllegalStateException: Current state not START_ELEMENT
ERROR (version 5.4.0.1-130, build 1 from 2015-06-14_12-34-55 by buildguy) : java.lang.RuntimeException: java.lang.IllegalStateException: Current state not START_ELEMENT
    at org.pentaho.di.trans.steps.excelinput.staxpoi.StaxPoiSheet.getRow(StaxPoiSheet.java:167)
    at org.pentaho.di.trans.steps.excelinput.ExcelInput.getRowFromWorkbooks(ExcelInput.java:607)
    at org.pentaho.di.trans.steps.excelinput.ExcelInput.processRow(ExcelInput.java:441)
    at org.pentaho.di.trans.step.RunThread.run(RunThread.java:62)
    at java.lang.Thread.run(Unknown Source)
Caused by: java.lang.IllegalStateException: Current state not START_ELEMENT
    at com.ctc.wstx.sr.BasicStreamReader.getAttributeValue(BasicStreamReader.java:641)
    at org.pentaho.di.trans.steps.excelinput.staxpoi.StaxPoiSheet.getRow(StaxPoiSheet.java:137)

The file is approximately 200 MB and has about 600,000 rows. The error occurs in the middle of the file.
The file itself is fine and can be opened in Excel.
The 'Spread sheet type (engine)' property of the Excel Input step is "Excel 2007 XLSX (Apache POI Streaming)".

Could you please help me with this issue?

Thanks,
Ivan.

Trouble specifying text input file

I'm sure this is a simple question, but I don't understand how to do it.

My Kettle job, in its job settings, sets INPUT_FOLDER to ${Internal.Job.Filename.Directory}/files.
The job sets Arg 1 = ".*\.CSV".

This calls a transformation, which contains a Text File Input step. I want to process all .CSV files in ${INPUT_FOLDER}; that seems to work OK.
But how should I specify that the Text File Input step use the value in command-line argument 1?

Edit - attached files.
Attached Files

User defined java class - Multiple tabs

Hi Kettle coders,

in a project I'm using the UDJC to implement a transformation mapping in order to mass-generate reports with the JasperReports library (I use the UDJC since I don't have time to write a Kettle step plugin, but I'm thinking of it!).
Thus I have quite a lot of code, and I want to split it into multiple tabs (classes) in the UDJC, as I could do in the JavaScript step by calling "loadScriptFromTab(...)" in the main script.
However, I couldn't find information on how to actually use/import the classes in the other tabs from the main Processor class. Can we do that, and if so, how?

I already tried to just use the class, or to do an "import MyClassName", but it doesn't work.

Thanks for your help

PS: I could put all the classes in a jar and import them, but I don't want to because of some class-loading problems on the DI Server, and I want the code to be independent of the deployment of the application WAR (so only the JasperReports library is in the WAR, not the code using it), all of this for obscure IT-administration reasons.

Cannot read / copy another field in the same stream row

Hi all,

When I try to get another field in the same row using JavaScript (without compatibility mode):

// Script here

var lookupIdentifier = null;

var idx1 = getInputRowMeta().indexOfValue("Identifier");
if (idx1 >= 0) {
    lookupIdentifier = row[idx1];
}

...

For the new lookupIdentifier field in the output stream, I only get values of the form:

[B@feffca
[B@fc5b35
...

I know this means that I'm reading a binary object, but idx1 seems to have the correct stream index.

Interestingly, when I use a Calculator step with "Create a copy of field A" just to perform the same action, I get exactly the same result.

Finally, in other transformations I use these techniques and they perform perfectly, so I'm assuming PDI is properly installed and configured.

The transformation starts by reading a CSV file in ANSI format. All fields in the stream apparently behave well in all the previous steps of this transformation; it is only when I try to copy another field in the stream that I hit this issue.

Has anyone seen this behavior? Any idea how to solve it?

This is a complete blocker; any help will be really appreciated.

Thanks in advance.
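For context on the symptom itself: a value printed as [B@feffca is Java's default toString() for a byte array. The usual cause in PDI is the CSV file input step's "Lazy conversion" option, which passes raw bytes downstream and only converts them when a step asks through the row metadata; raw row[idx] access then sees the bytes. Unticking Lazy conversion on the CSV input is the common fix. The difference, sketched with made-up stand-ins in plain JavaScript (rawRow and getString model the raw array and the converting accessor):

```javascript
// Made-up stand-ins: rawRow holds unconverted bytes (lazy conversion on),
// while getString() models what a row-meta conversion would return.
var rawRow = [{ bytes: [77, 121, 73, 100], toString: function () { return "[B@feffca"; } }];

function getString(row, idx) {
  // convert the byte values to characters, as the converting accessor would
  return row[idx].bytes.map(function (b) { return String.fromCharCode(b); }).join("");
}

String(rawRow[0]);    // "[B@feffca" -- the raw byte array, as in the post
getString(rawRow, 0); // "MyId"      -- the converted value
```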

Problem With Analysis Report

Hi,
I'm populating data using an Analysis Report in Pentaho Community Edition. When I use nearly 7 or 8 fields in the report, the fields run over the layout and the layout's side scroll bar does not work. What do I need to do to make it work?

Thanks in Advance

Open Source Analysis Report Solutions

Hi Team,

Hope I'm not re-iterating the question.

I have been using Pentaho (mostly Kettle) for more than 4 years.
We need to create analysis reports using Kettle and the Saiku report designer.
My question is: will these two tools suffice for creating analysis reports on the Community Edition, or will I have to purchase the Enterprise Edition of Pentaho to fulfil this requirement?
I would prefer the Community Edition because of the client's cost constraints.
Kindly let me know what else needs to be installed along with the Saiku designer for my analysis.
If there is a document I can refer to for such open-source BI solutions, it would be very useful.


Thanks & Regards,
Manoj.

Split Duplicates from Excel Input

Hello Everyone,

I am currently using Spoon (Kettle 4.4 stable).

Now I have a requirement as below:

1. Compare the Company Name across 3 Excel files [each file contains only Company Names (Strings)]
2. Once the comparison is done, dump the duplicate and unique records into separate files

Could anyone let me know the best possible way to achieve this?

Thanks in advance.
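One common pattern for this, sketched at the logic level: append the three Excel inputs into a single stream, sort on Company Name, count occurrences per name (Sort rows + Group by in Kettle terms), then route names with count > 1 to the duplicates file and the rest to the uniques file with a Filter rows step. The routing logic in plain JavaScript (splitByCount is a made-up name, not a Kettle step):

```javascript
// Counts occurrences of each name across the merged inputs, then routes
// names seen more than once to "duplicates" and the rest to "uniques".
function splitByCount(names) {
  var counts = {};
  names.forEach(function (n) { counts[n] = (counts[n] || 0) + 1; });
  var result = { duplicates: [], uniques: [] };
  Object.keys(counts).forEach(function (n) {
    (counts[n] > 1 ? result.duplicates : result.uniques).push(n);
  });
  return result;
}
```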