Channel: Pentaho Community Forums

Filtering Text File Input data with data from a Table Input

I have tried everything; I am very new to Pentaho.

My situation: I cannot use jobs, only transformations, because the .ktr is loaded by another program that only accepts transformations.

I have two sources of data:

- A TXT file with all the employees of a company; one of the fields is the country.
- A field in the Countries table of a database that says, for each country, whether or not its data should be imported.

The output must contain only the users from the countries whose flag allows importing, so I have to combine a TXT file with a database table.

- The user data for the allowed countries should be written to a table in the database.

I have tried every combination I can imagine, but because of the order of execution nothing worked. Any ideas?

The closest I got was this attempt: first truncate the final users table, then import all the users into it, and finally execute a SQL script that deletes the employees whose country should not be imported. The DELETE works fine in SQL Developer, but because of the execution order in Pentaho it runs too early, before any row is in the table. The Blocking Step seems to do nothing.

2015/09/04 13:16:25 - Usuarios People.0 - Opening file: C:\Users\XE52305\Desktop\IN\Interfaces\Em_PeopleSoft_low.txt
2015/09/04 13:16:25 - Limpiar Tabla Buffer.0 - Finished reading query, closing connection.
2015/09/04 13:16:25 - Limpiar Tabla Buffer.0 - Finished processing (I=0, O=0, R=0, W=1, U=1, E=0)
2015/09/04 13:16:25 - Execute SQL script.0 - Finished reading query, closing connection.
2015/09/04 13:16:25 - Execute SQL script.0 - Finished processing (I=0, O=0, R=0, W=1, U=1, E=0)
2015/09/04 13:16:25 - Usuarios People.0 - Finished processing (I=99, O=0, R=0, W=99, U=99, E=0)
2015/09/04 13:16:26 - Insert / Update.0 - Finished processing (I=99, O=99, R=99, W=99, U=99, E=0)
2015/09/04 13:16:26 - Spoon - The transformation has finished!!


error1.jpg.
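One way to avoid the execution-order problem entirely is to filter in-stream: read the Countries table with a Table Input, look the allow-import flag up onto each employee row (for example with a Stream Lookup keyed on the country field), and drop the disallowed rows before the Table Output. A minimal sketch of the row condition, assuming a hypothetical allow_import field added by the lookup:

Code:

// Hypothetical sketch for a Modified Java Script Value step.
// Assumes a Stream Lookup has already added an "allow_import" flag from the
// Countries table onto every employee row; the field name and "Y" value are assumptions.
var keepRow = (allow_import != null && String(allow_import).toUpperCase() == "Y");
keepRow;   // filter rows on this boolean (e.g. with a following Filter Rows step)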


Thank you

Blocks of multiple data sources to be placed on different pages

Hello,

I'm quite new to Pentaho.
I have a JSON Array with this structure:
Code:

[
  {
    "documents":
      [
        {
          "KEY_DOC_1": "VAL_DOC_1_11",
          "KEY_DOC_2": "VAL_DOC_2_11",
          ....
        },
        {
          "KEY_DOC_1": "VAL_DOC_1_12",
          "KEY_DOC_2": "VAL_DOC_2_12",
          ....
        },
        ...
      ],
      "actors":
      [
        {
          "KEY_ACT_1": "VAL_ACT_1_11",
          "KEY_ACT_2": "VAL_ACT_2_11",
          ....
        },
        {
          "KEY_ACT_1": "VAL_ACT_1_12",
          "KEY_ACT_2": "VAL_ACT_2_12",
          ....
        },
        {
          "KEY_ACT_1": "VAL_ACT_1_13",
          "KEY_ACT_2": "VAL_ACT_2_13",
          ....
        },
        ...
      ]
  },
  ...
]

I retrieve this JSON array through PDI. Its structure should be suitable for what I have to achieve (explained below), but I may need to change it a bit; please tell me.
What I have to do in Report Designer is place one <DOCUMENTS_ARRAY, ACTORS_ARRAY> pair on each page of the report, i.e. each page must show a list of documents and a list of actors.
Let's see an example. Say this is my JSON Array:
Code:

[
  {
    "documents":
      [
        {
          "key_doc_1": "val_doc_1_11",
          "key_doc_2": "val_doc_2_11",
          "key_doc_3": "val_doc_3_11",
        },
        {
          "key_doc_1": "val_doc_1_12",
          "key_doc_2": "val_doc_2_12",
          "key_doc_3": "val_doc_3_12",
        }
      ],
      "actors":
      [
        {
          "key_act_1": "val_act_1_11",
          "key_act_2": "val_act_2_11",
          "key_act_3": "val_act_3_11",
          "key_act_4": "val_act_4_11"
        },
        {
          "key_act_1": "val_act_1_12",
          "key_act_2": "val_act_2_12",
          "key_act_3": "val_act_3_12",
          "key_act_4": "val_act_4_12"
        }
      ]
  },
  {
    "documents":
      [
        {
          "key_doc_1": "val_doc_1_21",
          "key_doc_2": "val_doc_2_21",
          "key_doc_3": "val_doc_3_21",
          "key_doc_4": "val_doc_4_21"
        }
      ],
      "actors":
      [
        {
          "key_act_1": "val_act_1_21",
          "key_act_2": "val_act_2_21",
          "key_act_3": "val_act_3_21",
          "key_act_4": "val_act_4_21"
        },
        {
          "key_act_1": "val_act_1_22",
          "key_act_2": "val_act_2_22",
          "key_act_3": "val_act_3_22",
          "key_act_4": "val_act_4_22"
        },
        {
          "key_act_1": "val_act_1_23",
          "key_act_2": "val_act_2_23",
          "key_act_3": "val_act_3_23",
          "key_act_4": "val_act_4_24"
        }
      ]
  },
  {
    "documents":
      [
        {
          "key_doc_1": "val_doc_1_31",
          "key_doc_2": "val_doc_2_31",
          "key_doc_3": "val_doc_3_31"
        }
      ],
      "actors":
      [
        {
          "key_act_1": "val_act_1_31",
          "key_act_2": "val_act_2_31",
          "key_act_3": "val_act_3_31",
          "key_act_4": "val_act_4_31"
        }
      ]
  }
]

For the example above I would get three pages. This is how they should look:

FIRST PAGE

Documents
key_doc_1 key_doc_2 key_doc_3
val_doc_1_11 val_doc_2_11 val_doc_3_11
val_doc_1_12 val_doc_2_12 val_doc_3_12

Actors
key_act_1 key_act_2 key_act_3 key_act_4
val_act_1_11 val_act_2_11 val_act_3_11 val_act_4_11
val_act_1_12 val_act_2_12 val_act_3_12 val_act_4_12


SECOND PAGE

Documents
key_doc_1 key_doc_2 key_doc_3
val_doc_1_21 val_doc_2_21 val_doc_3_21

Actors
key_act_1 key_act_2 key_act_3 key_act_4
val_act_1_21 val_act_2_21 val_act_3_21 val_act_4_21
val_act_1_22 val_act_2_22 val_act_3_22 val_act_4_22
val_act_1_23 val_act_2_23 val_act_3_23 val_act_4_23


THIRD PAGE

Documents
key_doc_1 key_doc_2 key_doc_3
val_doc_1_31 val_doc_2_31 val_doc_3_31

Actors
key_act_1 key_act_2 key_act_3 key_act_4
val_act_1_31 val_act_2_31 val_act_3_31 val_act_4_31


How should I proceed? I tried a few things, but I really have no clue how to accomplish the task. Should I transform the original JSON?



EDIT: In Kettle I managed to transform the data so that a "JSON Input" step produces the following output (seen in preview mode):

Code:

OUT_DOCUMENTS                        OUT_ACTORS
[{ <DOC11_DATA> },{ <DOC12_DATA> }]  [{ <ACT11_DATA> },{ <ACT12_DATA> }]
[{ <DOC21_DATA> }]                   [{ <ACT21_DATA> },{ <ACT22_DATA> },{ <ACT23_DATA> }]
[{ <DOC31_DATA> }]                   [{ <ACT31_DATA> }]

Now what?
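One possible direction: if every document and actor row carries an index saying which element of the outer array it came from, the report can group on that index and start a new page per group. A minimal sketch of the flattening in plain JavaScript (not tied to a specific PDI step; jsonText and the field names are assumptions):

Code:

// Hypothetical sketch: flatten the outer array so that every document and
// actor becomes one row tagged with a page_id; the report can then group on
// page_id and insert a page break after each group.
var pages = JSON.parse(jsonText);          // jsonText holds the array shown above
var rows = [];
for (var i = 0; i < pages.length; i++) {
  var p = pages[i];
  for (var d = 0; d < p.documents.length; d++) {
    rows.push({ page_id: i + 1, kind: "document", data: p.documents[d] });
  }
  for (var a = 0; a < p.actors.length; a++) {
    rows.push({ page_id: i + 1, kind: "actor", data: p.actors[a] });
  }
}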

File watcher for multiple files daily

Hi,

My folder contains 15 files named by date. Using a file watcher, I have to check whether the files for a given date are present in the folder and execute the transformation job accordingly. I have gone through the forum but didn't find any solution.
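A minimal sketch of the existence check itself, assuming a JavaScript job entry and a hypothetical date-stamped file name pattern:

Code:

// Hypothetical sketch for a JavaScript job entry: build the expected file name
// for the given date and return true only if it exists, so the job can branch
// to the transformation. The folder, variable name and file pattern are assumptions.
var dateStr = parent_job.getVariable("RUN_DATE");                  // e.g. "20150907"
var f = new Packages.java.io.File("C:/data/in/report_" + dateStr + ".txt");
f.exists();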

Can someone help me resolve this issue?

Regards
AmeyP

CDE dashboard: "Error processing component"?

Hi guys!
I'm new to Pentaho, and I ran into some problems when I tried to create a dashboard with CDE.
1. When I tried to use a template, there was a warning saying "WARNING: Dashboard Layout will be overwritten!"
2. When I tried to preview the dashboard, an error occurred saying 'error processing component'.

I'm looking forward to your answers; I have to finish this Pentaho project ASAP. Thank you very much.
Logs are attached.
(Sorry for my bad English.)

Using Match in Job Hop

We have a job that processes multiple files. Depending on its filename, each file branches to a transformation that processes it. We basically use a JavaScript 'test' job entry with something like the following code:
Code:

parent_job.getVariable("FILENAME").startsWith("bk");
If true, we branch to the transformation, otherwise branch to the next 'test'.
However, now we have files coming in from a new client whose filenames are prefixed with a client identifier. I don't like using substring, because I could have one file that starts with 'ca' and another that starts with 'cab', in which case I would need to start testing for length and other things. I came up with the idea of using the .match() function, but am having trouble getting it to work. We are using (don't laugh) version 3.2.0.
I have tried both of the following (all filenames consist of a three-character client code, the filename, then yymmdd to signify a date; there are no file extensions):
Code:

parent_job.getVariable("FILENAME").match(/^([a-z]{3}bk[0-9]{6})$/i);
and
Code:

parent_job.getVariable("FILENAME").match(^([a-z]{3}bk[0-9]{6})$);
but neither one seems to process any files. It is my understanding that the above should return either true or false, so no 'testing' of the result is necessary. Can someone help?

I have attached the job.
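One thing worth noting: String.match() returns the array of matches on success and null otherwise, not a boolean, and the second attempt is missing the / delimiters around the regular expression. A minimal sketch of a test that evaluates to a plain true/false, using RegExp.test() instead:

Code:

// Sketch: RegExp.test() returns a boolean, so the hop condition can use its
// result directly. The pattern is taken from the attempts above.
var filename = parent_job.getVariable("FILENAME");
/^[a-z]{3}bk[0-9]{6}$/i.test(filename);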

Spoon 3.2.2 is not showing all steps in design mode

Hello gurus,

I am new to Spoon. With version 3.2.2 I have an issue with some steps not being displayed in Design mode, for example the Input folder and the steps inside it. My team member, on the other hand, can see all the steps. Please help me resolve this issue.

BI Server scheduler page

I installed BI Server CE and logged in as admin. I can see one schedule on the Schedules page, but there is no button to create new schedules. How do I schedule an xaction using BI Server CE?


Thanks!

Pentaho unable to copy files to Hadoop HDFS file system 1.0.3

Hi All,
This is my first thread; I am using Pentaho Kettle version 5.4.0.1-130.

I have installed Hadoop 1.0.3 in VMware Player and configured it with a bridged network.

Pentaho is installed on my Windows 10 desktop, and Hadoop runs in the above-mentioned VM.

I'm trying to run a "Hadoop Copy Files" job, but it fails with the error below.

SourceEnvironment: <static>
SourceFile/Folder: file:///C:/Study/Pentaho/data-integrationC:/Study/Pentaho/data-integration
DestinationEnvironment: <static>
Destination File/Folder: hdfs://notroot/hadoop123@192.168.139.128:8020/input

I tried creating the folder C:\Study\Pentaho\data-integration\plugins\pentaho-big-data-plugin\hadoop-configurations\hadoop-103\lib and copying the files into it, as per the instructions on this website (http://funpdi.blogspot.in/2013/03/pe...nd-hadoop.html), but still no luck.

Kindly advise: what am I doing wrong? Thank you!

2015/09/07 11:56:02 - Hadoop Copy Files - Processing row source File/folder source : [file:///C:/Study/Pentaho/data-integrationC:/Study/Pentaho/data-integration] ... destination file/folder : [hdfs://notroot/hadoop123@192.168.139.128:8020/input]... wildcard : [^.*\.txt]
2015/09/07 11:56:04 - Hadoop Copy Files - ERROR (version 5.4.0.1-130, build 1 from 2015-06-14_12-34-55 by buildguy) : Can not copy file/folder [file:///C:/Study/Pentaho/data-integrationC:/Study/Pentaho/data-integration] to [hdfs://notroot/hadoop123@192.168.139.128:8020/input]. Exception : [
2015/09/07 11:56:04 - Hadoop Copy Files -
2015/09/07 11:56:04 - Hadoop Copy Files - Unable to get VFS File object for filename 'hdfs://notroot/hadoop123@192.168.139.128:8020/input' : Could not resolve file "hdfs://notroot/hadoop123@192.168.139.128:8020/input".
2015/09/07 11:56:04 - Hadoop Copy Files -
2015/09/07 11:56:04 - Hadoop Copy Files - ]
2015/09/07 11:56:04 - Hadoop Copy Files - ERROR (version 5.4.0.1-130, build 1 from 2015-06-14_12-34-55 by buildguy) : org.pentaho.di.core.exception.KettleFileException:
2015/09/07 11:56:04 - Hadoop Copy Files -
2015/09/07 11:56:04 - Hadoop Copy Files - Unable to get VFS File object for filename 'hdfs://notroot/hadoop123@192.168.139.128:8020/input' : Could not resolve file "hdfs://notroot/hadoop123@192.168.139.128:8020/input".
2015/09/07 11:56:04 - Hadoop Copy Files -
2015/09/07 11:56:04 - Hadoop Copy Files -
2015/09/07 11:56:04 - Hadoop Copy Files - at org.pentaho.di.core.vfs.KettleVFS.getFileObject(KettleVFS.java:154)
2015/09/07 11:56:04 - Hadoop Copy Files - at org.pentaho.di.core.vfs.KettleVFS.getFileObject(KettleVFS.java:102)
2015/09/07 11:56:04 - Hadoop Copy Files - at org.pentaho.di.job.entries.copyfiles.JobEntryCopyFiles.ProcessFileFolder(JobEntryCopyFiles.java:421)
2015/09/07 11:56:04 - Hadoop Copy Files - at org.pentaho.di.job.entries.copyfiles.JobEntryCopyFiles.execute(JobEntryCopyFiles.java:375)
2015/09/07 11:56:04 - Hadoop Copy Files - at org.pentaho.di.job.Job.execute(Job.java:716)
2015/09/07 11:56:04 - Hadoop Copy Files - at org.pentaho.di.job.Job.execute(Job.java:859)
2015/09/07 11:56:04 - Hadoop Copy Files - at org.pentaho.di.job.Job.execute(Job.java:532)
2015/09/07 11:56:04 - Hadoop Copy Files - at org.pentaho.di.job.Job.run(Job.java:424)
2015/09/07 11:56:04 - pentaho_to_hadoop_ex3ktr - Finished job entry [Hadoop Copy Files] (result=[false])
2015/09/07 11:56:04 - pentaho_to_hadoop_ex3ktr - Job execution finished
2015/09/07 11:56:04 - Spoon - Job has ended.

Consolidated Report Using Spoon Tool

Dear Experts,

I hope you are all doing well. I have a requirement to create a CMDB report from the Computer System, Operating System, Application, and Physical Location classes. The report will contain the following columns:
1. Server Name (From Computer System Class)
2. HostName (From Computer System Class)
3. OS (From Operating System Class)
4. Service Pack (From Operating System Class)
5. Location (From Physical Location Class)

As I understand it, this is achievable with the Pentaho Spoon tool. I have tried to create a transformation like the attached one; so far it only joins the CS and OS classes, and if that works I will join the other classes as well. But no luck: the expected report is not coming out.

The Join Rows step holds the condition: CS class Name equals OS class System Name.

Can you please guide me to an effective approach for completing this task?

Eagerly waiting for your reply.

Regards,
Suresh.

Pentaho getFileName Error

I am trying to send mails through Pentaho DI. This is my simple transformation; I have 20 transformations like the one below, and they all run one by one in a single job:
Get File Names ----> Add Constants ----> Mail

But when I run this, some mails are not sent, with the error below, and some mails report success but the mail is never actually received at the destination.

Get File Names.0 - Following required files are not accessible: file:///D:/FTP/DAILY_REPORT/IN_VS_MSC_DAILY_TOTAL

Can anyone tell me what the reason for this is?

This is the error message for the unsuccessful jobs:

error.jpg


Check text file based on change date

Does anyone know how I can check a text file (Word, XML) based on its change date (not the creation date)?
For example: I want to select the files that were changed in the last 24 hours.
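One possible approach: the Get File Names step emits a last-modified date field for each file, which can be compared against a cutoff. A minimal sketch of that comparison in a Modified Java Script Value step, assuming the field is named lastmodifiedtime:

Code:

// Sketch: keep only files modified in the last 24 hours.
// "lastmodifiedtime" is assumed to be the date field coming from a
// Get File Names step; it arrives as a java.util.Date in the JS step.
var cutoff = new Date();
cutoff.setDate(cutoff.getDate() - 1);                     // 24 hours ago
var changedRecently = lastmodifiedtime.getTime() >= cutoff.getTime();
changedRecently;   // filter rows on this boolean afterwards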

Thanks,
Kristof

Want to remove negative scale on x-axis

Hi All,

I am using a waterfall chart in my dashboard, and the problem is that it shows a negative scale on the x-axis when I select certain date ranges; the negative scale appears for seemingly random ranges. Sometimes a range shows a negative scale and sometimes not. Please give me a solution to this problem.
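If this is a CCC-based waterfall component in CDE, pinning the value-axis minimum at zero in the chart definition may remove the negative part of the scale. A hedged sketch; the option name is from CCC v2 and should be checked against the CCC version bundled with your dashboard plugin:

Code:

// Hedged sketch for the component's preExecution function in CDE:
// fix the value (ortho) axis minimum at zero so the scale never goes negative.
function() {
    this.chartDefinition.orthoAxisFixedMin = 0;
}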

Thanks and Regards
Anuj

Preserve Memory from Stream Lookup

Hi,

Can anyone tell me the difference between checking and unchecking the Preserve Memory (Cost CPU) checkbox in the Stream Lookup step in Pentaho? In one transformation the source table has more than 2 million records, whereas in the other it has around 10k records. Will enabling Preserve Memory make any difference for either transformation?

Regards,
Sandeep C

TimePlot component in Pentaho CE 5.4 with a CDE dashboard

Hello All,
How can I use the TimePlot component in Pentaho CE 5.4 with a CDE dashboard?

It's a very basic question; please let me know.

[Summer Holiday / Independent Research] Building PDI from source code

It has been a while since my last post.

I have posted the articles below on the Pentaho blog. If anything is unclear or hard to follow, please post your questions in this forum.


[Summer Holiday / Independent Research] Building PDI from source code
http://www.pentaho-partner.jp/blog/2015/08/pdi-11.html

[Summer Holiday / Independent Research] Building PDI from source code (2)
http://www.pentaho-partner.jp/blog/2015/08/pdi2-1.html

[Summer Holiday / Independent Research] Building PDI from source code (3)
www.pentaho-partner.jp/blog/2015/09/pdi3.html

Date Parameter in Report Designer not working

Hi

I am relatively new to the Pentaho Report Designer. I am using version 5.3

I have a formula for a field in the report:

IF(AND([INVOICE_DATE] = [End Date]; [ROUTE_ID] = 103); [UNIT_QUANTITY]; 0)

where 'End Date' is a parameter. I am not getting the Unit_Quantity data.

However if I use the following code:

IF(AND([INVOICE_DATE] = Date(2015;07;29); [ROUTE_ID] = 103); [UNIT_QUANTITY]; 0)

the data appears.

Can anyone advise me on this?

Regards

Gerald Mollineau

Report variable question

Hi, I have records for buy/sell transactions, grouped by transaction type (buy or sell). My report shows the buy and sell totals at the end of each group. I need to find the difference between the buy and sell sums. I thought I could set a variable in the group footer, using the group field value as its name, but I couldn't figure out how. Any ideas for accomplishing this? Thanks.

Check Command Error in a .bat file

Hi guys, as the title says, I would like to know how to check, in an IF/ELSE condition, whether a command was correctly executed.

Thank you very much.

Weka does not handle CSV files through the command line

Hi all,

Currently I'm working with the Weka-dev 3.7.12 version.
I get the unfriendly ArrayIndexOutOfBoundsException below when I try to feed a CSV file to Weka, but the same data works fine in the GUI or as an ARFF file.

Quote:

java.lang.ArrayIndexOutOfBoundsException: 1
at com.bosch.weka2odd.cgenerator.evaluation.Evaluation.setPriors(Evaluation.java:3670)
at com.bosch.weka2odd.cgenerator.evaluation.Evaluation.evaluateModel(Evaluation.java:1437)
at com.bosch.weka2odd.cgenerator.cGenerateAdapter.impl.CcodeAdapterImpl.classifyIntoCCode(CcodeAdapterImpl.java:44)
at com.bosch.weka2odd.controller.impl.ControllerImpl.classifyWithCOutput(ControllerImpl.java:33)
at com.bosch.weka2odd.cli.impl.WekaCLIImpl.main(WekaCLIImpl.java:73)
After that, I researched the issue on the internet and found this link: http://stackoverflow.com/questions/2...with-csv-files

Does anybody know whether Weka has any plan or solution for this problem? What I need is exactly to handle CSV files through the command line in my project, not the GUI or ARFF!
Please help!

Making dashboards that users can customize themselves?

Hi!

I'm currently looking at creating a dashboard that users can customize themselves. I'm familiar with CDF/CDE and Saiku and would like to use these, but allow the user to easily customize which charts and tables they see via drag-and-drop or a similarly intuitive UI. Has anyone tried anything similar, or does anyone have ideas on how best to do this?

I would like to avoid creating too much of my own functionality and instead rely on what already exists in the community. I've tried out the new plugins Self Service BI and Ivy Dashboard Designer; they are promising and I'm considering helping out with their development.