Channel: Pentaho Community Forums

Create Folder at Root Level in Pentaho Server 8.2

Hi to all,

In previous versions of Pentaho BA Server (e.g. 6.1) I could create a new root-level folder (next to Public and Home) using this REST API:

PUT: http://10.10.10.10:8080/pentaho/api/...rs/%3AMyNEWDir

In version 8.2 this is not possible anymore; it returns a 403 Forbidden, and I think the cause is this piece of code (row 86):

https://github.com/pentaho/pentaho-p...yResource.java

Why this behaviour? Also, if you delete the Public or Home directory by mistake, you are not able to recreate them, and the Pentaho server is then broken.

Any hint on how to work around this problem?

Thank you.
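For reference, a minimal standalone sketch of the call. The post's URL is truncated, and /api/repo/dirs/{pathId} is my reading of it, not a verified path; host and credentials are placeholders:

Code:

import java.net.HttpURLConnection;
import java.net.URL;
import java.util.Base64;

public class CreateRootFolder {
    public static void main(String[] args) throws Exception {
        // Assumed full endpoint; the original URL is truncated, and
        // /api/repo/dirs/{pathId} is a guess. Credentials are placeholders.
        URL url = new URL("http://10.10.10.10:8080/pentaho/api/repo/dirs/%3AMyNEWDir");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("PUT");
        conn.setRequestProperty("Authorization", "Basic "
            + Base64.getEncoder().encodeToString("admin:password".getBytes("UTF-8")));
        conn.setDoOutput(true);
        conn.getOutputStream().close(); // empty body
        System.out.println("HTTP " + conn.getResponseCode()); // 403 on 8.2, per the post
    }
}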

How to format a date in a Date Dimension with Mondrian 4?

Hi,
I need help.


I have this fragment of my Mondrian 3 schema, and I am having great difficulty rewriting the same Date dimension for Mondrian 4. I want to format my dates.
It is a degenerate dimension.


<Schema name="Commercial">
<Dimensions>...</Dimensions>
<Dimensions>...</Dimensions>
<Dimensions>...</Dimensions>
<Cube name="Commercial process" visible="true" cache="true" enabled="true">
<Table name="shipinvoice_v">
</Table>
<DimensionUsage source="invoice" > </DimensionUsage>
...
<Dimension type="TimeDimension" visible="true" highCardinality="false" name="ORDER DATE">
<Hierarchy name="Order Date" visible="true" hasAll="true">
<Level name="Year" visible="true" column="day_order_id" type="String" uniqueMembers="true" levelType="TimeYears" hideMemberIf="Never">
<CaptionExpression>
<SQL dialect="generic">
<![CDATA[to_char(c_order_dateordered::date, 'YYYY')]]>
</SQL>
</CaptionExpression>
</Level>
<Level name="Month" visible="true" column="day_order_id" type="String" uniqueMembers="false" levelType="TimeMonths" hideMemberIf="Never">
<CaptionExpression>
<SQL dialect="generic">
<![CDATA[TO_CHAR((extract(month from c_order_dateordered)),'fm00') || '/' || extract(year from c_order_dateordered)]]>
</SQL>
</CaptionExpression>
</Level>
<Level name="Day" visible="true" column="day_order_id" type="String" uniqueMembers="false" levelType="TimeDays" hideMemberIf="Never">
<CaptionExpression>
<SQL dialect="generic">
<![CDATA[to_char(c_order_dateordered::date, 'DD/MM/YYYY')]]>
</SQL>
</CaptionExpression>
</Level>
</Hierarchy>
</Dimension>
</Cube>
</Schema>
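For comparison, here is a hedged sketch of the direction a Mondrian 4 rewrite might take. My understanding (to be checked against the Mondrian 4 documentation) is that SQL expressions move into the physical schema as calculated columns, which attributes can then reference as caption columns; all element names below (PhysicalSchema, CalculatedColumnDef, ExpressionView, Attributes) are from that reading, not a tested schema:

Code:

<Schema name="Commercial" metamodelVersion="4.0">
  <PhysicalSchema>
    <Table name="shipinvoice_v">
      <ColumnDefs>
        <!-- Calculated caption columns; the SQL is the same as in the
             Mondrian 3 CaptionExpression elements above. -->
        <CalculatedColumnDef name="order_year_caption">
          <ExpressionView>
            <SQL dialect="generic">to_char(c_order_dateordered::date, 'YYYY')</SQL>
          </ExpressionView>
        </CalculatedColumnDef>
        <CalculatedColumnDef name="order_day_caption">
          <ExpressionView>
            <SQL dialect="generic">to_char(c_order_dateordered::date, 'DD/MM/YYYY')</SQL>
          </ExpressionView>
        </CalculatedColumnDef>
      </ColumnDefs>
    </Table>
  </PhysicalSchema>
  <Cube name="Commercial process">
    <Dimensions>
      <Dimension name="ORDER DATE" table="shipinvoice_v" type="TIME">
        <Attributes>
          <Attribute name="Year" keyColumn="day_order_id"
                     captionColumn="order_year_caption" levelType="TimeYears"/>
          <Attribute name="Day" keyColumn="day_order_id"
                     captionColumn="order_day_caption" levelType="TimeDays"/>
        </Attributes>
      </Dimension>
    </Dimensions>
    ...
  </Cube>
</Schema>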




Regards,
Lotenbol

Pentaho 6.1 version not working

Hello Everyone,

We have been using Pentaho 6.1 for the last 2 years.
In all that time we have not faced any issue opening Spoon.
For the last few days, whenever we try to open Pentaho by clicking the spoon.bat file, it does not open, and it creates very heavy files with the .mdmp extension in the Pentaho Data Integration folder.
We then cleared memory space and deleted the .mdmp files (several GB each) and the cache files from the server, but still ended up with nothing.
At last we restarted the server and Spoon started working fine.

Has any of you faced the same issue before? Please suggest a permanent solution, rather than restarting the server again.


Regards,
Robin

Creating data source/data connections by code/sql

Hi.

I have a Pentaho deployment on Docker. It brings up a fresh Jackrabbit repository database every time. I want to know if I can script something so that my new datasources are created when the deployment is created.

What I'm trying to avoid is having to pre-configure the datasources manually. I think this is done in the Jackrabbit scripts, but I can't find them.
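If scripting against the running server is acceptable, one direction is the data-access plugin's REST API rather than the Jackrabbit scripts. A hedged sketch: the endpoint path and the JSON shape are assumptions about the data-access plugin's connection API and should be checked against your server version; all names and credentials are placeholders:

Code:

import java.net.HttpURLConnection;
import java.net.URL;
import java.util.Base64;

public class AddDatasource {
    public static void main(String[] args) throws Exception {
        // Hypothetical connection definition; every field name here is an
        // assumption about the DatabaseConnection JSON, not a verified schema.
        String json = "{\"name\":\"MyDS\",\"databaseType\":\"POSTGRESQL\","
            + "\"accessType\":\"NATIVE\",\"hostname\":\"db\",\"databaseName\":\"mydb\","
            + "\"databasePort\":\"5432\",\"username\":\"user\",\"password\":\"secret\"}";

        // Assumed endpoint of the data-access plugin; verify for your version.
        URL url = new URL("http://localhost:8080/pentaho/plugin/data-access/api/datasource/jdbc/connection/MyDS");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("PUT");
        conn.setRequestProperty("Content-Type", "application/json");
        conn.setRequestProperty("Authorization", "Basic "
            + Base64.getEncoder().encodeToString("admin:password".getBytes("UTF-8")));
        conn.setDoOutput(true);
        conn.getOutputStream().write(json.getBytes("UTF-8"));
        System.out.println("HTTP " + conn.getResponseCode());
    }
}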

Thank you.

Disabling a branch in a transformation

After experimenting with some "improved" stream/branch disable functionality, my big data transformation now suffers from congestion and comes to a halt! The transformation reads a single file and writes statistics to fact tables. There are more than 22'000'000 rows in total in this file, which means everything has to run smoothly and fast, or rows start to pile up.

Some sub-streams should be disabled depending on arguments given at start. The straightforward way to do this in Spoon is to:

1) Add a "Get Variables" step. Add the variable which decides whether the branch should be disabled or not.
2) Add a "Filter" step, filtering on the stream field set above. True: continue the stream. False: disable the stream.

However, this process means the same constant field is added 22'000'000 times (step 1 above), and the logic comparison is then done 22'000'000 times (step 2 above). That's 21'999'999 times more than necessary!

So I tried to make my own Java code to test only once:

Code:

public boolean processRow(StepMetaInterface smi, StepDataInterface sdi) throws KettleException
{
    Object[] r = getRow();


    if (r == null)
    {
        setOutputDone();
        return false;
    }
   
    if (first)
    {
        first = false;
        //Disable the stream / branch if ENABLE_BRANCH variable is not 'Y' (yes)
        String enable = getVariable("ENABLE_BRANCH", "NULL");
        if(!enable.equals("Y"))
        {
            setOutputDone();
            return false;
        }
    }
   
    r = createOutputRow(r, data.outputRowMeta.size());
    putRow(data.outputRowMeta, r);
    return true;
}

This works excellently with small data. The sub-branch to be disabled is immediately green/finished after receiving the first row. But the transformation freezes with big data. Why? What am I missing? It seems like the steps upstream are still trying to send rows somehow. Why would they try to send rows when this step has already executed setOutputDone() and returned false?

I wish the filter step would accept variables!
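For what it's worth, one hypothesis (an assumption about the engine internals, not verified): setOutputDone() marks this step's output as finished, but upstream steps keep writing rows into this step's input buffer; once that buffer fills and nobody reads from it, the upstream steps block and the transformation looks frozen. A hedged sketch of a variant that keeps draining its input when the branch is disabled, instead of returning false:

Code:

private boolean enabled = true;

public boolean processRow(StepMetaInterface smi, StepDataInterface sdi) throws KettleException
{
    Object[] r = getRow();

    if (r == null)
    {
        setOutputDone();
        return false;
    }

    if (first)
    {
        first = false;
        // Evaluate the variable once, on the first row only.
        enabled = getVariable("ENABLE_BRANCH", "N").equals("Y");
    }

    // When disabled: consume the row but write nothing, so the
    // upstream steps never block on a full row buffer.
    if (!enabled)
    {
        return true;
    }

    r = createOutputRow(r, data.outputRowMeta.size());
    putRow(data.outputRowMeta, r);
    return true;
}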

Kettle 6.1 and XLSX File

Hi to all,
I need your help to understand a strange situation.
I need to import some data from an xlsx file that contains 16 sheets.

Currently the xlsx file is about 20 MB.

If I try to import it with Apache POI, Kettle crashes... I've already tried to increase the JVM options in spoon.bat, but I don't have a large amount of memory on the server.

For this reason I tried to import the file using Apache POI Streaming, and it works like a charm.

For the previous operation I was using a test file.

When my import job was ready, I received the real file, and the job runs correctly without importing a single row.

I've seen that the only difference between the test file and the real file is this:
Test file: program name: Microsoft Excel
Real file: program name: Apache POI


So it seems that if the xlsx file is created with Apache POI, Kettle is not able to read its content.

Any ideas?

Thanks in advance
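One hypothesis (unverified): files written by Apache POI often store cell text as inline strings instead of in the shared-strings table, and some streaming readers skip inline strings, which would match "runs correctly without importing a single row". A minimal sketch, outside Kettle, to check that POI's own streaming reader at least sees the sheets (the file path is passed as an argument):

Code:

import java.io.InputStream;
import org.apache.poi.openxml4j.opc.OPCPackage;
import org.apache.poi.xssf.eventusermodel.XSSFReader;

public class SheetProbe {
    public static void main(String[] args) throws Exception {
        // Open the workbook without loading it fully into memory.
        try (OPCPackage pkg = OPCPackage.open(args[0])) {
            XSSFReader reader = new XSSFReader(pkg);
            XSSFReader.SheetIterator it = (XSSFReader.SheetIterator) reader.getSheetsData();
            while (it.hasNext()) {
                try (InputStream sheet = it.next()) {
                    System.out.println("Found sheet: " + it.getSheetName());
                }
            }
        }
    }
}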

Daily schedule skips a day every now and then [pentaho-server 7.1]

Hello,

I have a Pentaho Server CE with more than 150 schedules that have different frequencies: monthly, daily, weekly, etc. However, I am seeing strange behavior every now and then, where a daily schedule is not executed every day. It is strangely happening every Monday. Today is actually the 3rd Monday in a row on which this daily schedule was not executed. Here is a screenshot from the scheduler where you can see that there are 3 daily schedules, but only 2 were executed last night:

https://cdn.discordapp.com/attachmen...chedule_en.png

Is there anything that could make this schedule not execute on Mondays? It has been up for a few months already and this never happened before. I have also checked localhost.log and catalina.out and all the other logs, but there is nothing there from the time when this schedule should have been executed.

Remove value from multivalue LDAP attribute

What's a good strategy for removing a value from a semicolon-separated multi-value field coming from an LDAP Input step?
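A minimal sketch of the string logic itself, as plain Java (in PDI the same few lines could live in a User Defined Java Class or a scripting step; field and value names are placeholders):

Code:

import java.util.ArrayList;
import java.util.List;

public class MultiValueFilter {
    // Remove one value from a semicolon-separated multi-value attribute.
    static String removeValue(String multiValue, String toRemove) {
        List<String> kept = new ArrayList<>();
        for (String v : multiValue.split(";")) {
            if (!v.trim().equals(toRemove)) {
                kept.add(v.trim());
            }
        }
        return String.join(";", kept);
    }

    public static void main(String[] args) {
        System.out.println(removeValue("cn=a;cn=b;cn=c", "cn=b")); // cn=a;cn=c
    }
}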

S3 CSV file input changes from PDI 6.1 to PDI 8.1

Hi,

We were able to connect with the S3 CSV File Input step successfully in PDI 6.1, but not in 8.1 or higher; the step differs from version to version.

If you are able to help with the "Select bucket" option of the S3 CSV File Input step in PDI 8.1, it would be good for me.

I have gone through YouTube and the PDI documentation, but they are not clear for the 8.1 version: how can we define the S3 bucket access key and secret key in 8.1?


Thanks for your support.
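For what it's worth, my understanding (an assumption worth verifying against the 8.x release notes) is that the newer S3 steps switched to the AWS SDK's default credential chain, so the keys are no longer typed into the step but picked up from the environment variables AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY or from a standard credentials file like this:

Code:

[default]
aws_access_key_id = YOUR_ACCESS_KEY
aws_secret_access_key = YOUR_SECRET_KEY

(The file lives at ~/.aws/credentials on Linux, or %USERPROFILE%\.aws\credentials on Windows.)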

Text File Output encoding ignored

Hi everybody!
I'm working with Pentaho 8.0, and I'm facing an encoding problem.

What I currently need to do is to create a simple csv file, with ISO-8859-1 encoding.
For testing purposes I use a Table Input step as input, with a couple of strings from dual, and this is the only input of my Text File Output.

Despite specifying that encoding in the "Content" tab, the file generated is always detected as us-ascii, no matter what I set.

I've searched a lot, but it seems I couldn't find a real solution to this problem. Is this a bug?

Many thanks in advance!
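One possible explanation rather than a bug (an assumption worth checking against your test data): if the strings you output contain only ASCII characters, the ISO-8859-1 bytes are identical to US-ASCII bytes, so any encoding detector will report the file as us-ascii even though it was written exactly as requested. A minimal standalone sketch demonstrating this:

Code:

import java.util.Arrays;

public class EncodingCheck {
    public static void main(String[] args) throws Exception {
        // Pure-ASCII text: ISO-8859-1 and US-ASCII produce identical bytes,
        // so detectors report the narrowest encoding, us-ascii.
        String ascii = "plain test data";
        System.out.println(Arrays.equals(
            ascii.getBytes("ISO-8859-1"), ascii.getBytes("US-ASCII"))); // true

        // Only characters above 0x7F make the two encodings distinguishable.
        String latin1 = "café";
        System.out.println(Arrays.equals(
            latin1.getBytes("ISO-8859-1"), latin1.getBytes("US-ASCII"))); // false
    }
}

Try adding a character above 0x7F (e.g. "é") to a test row; if the detector then reports ISO-8859-1, the step was honoring the setting all along.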

com.sybase.jdbc4.jdbc.SybDriver

Hi guys, I need to connect to a database via the Table Input step, but I'm not able to connect.

I've been told that it is a Sybase database and that the driver to be used is com.sybase.jdbc4.jdbc.SybDriver.

Which connection type and access method do I use for the Table Input step in Spoon?
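In the meantime, a minimal standalone connectivity check, assuming Sybase's jConnect driver (com.sybase.jdbc4.jdbc.SybDriver is jConnect's class, so jconn4.jar must be on the classpath); host, port and database name below are placeholders:

Code:

import java.sql.Connection;
import java.sql.DriverManager;

public class SybaseProbe {
    public static void main(String[] args) throws Exception {
        // jdbc:sybase:Tds:<host>:<port>/<db> is jConnect's URL format;
        // dbhost, 5000 and mydb are hypothetical values.
        Class.forName("com.sybase.jdbc4.jdbc.SybDriver");
        try (Connection c = DriverManager.getConnection(
                "jdbc:sybase:Tds:dbhost:5000/mydb", "user", "password")) {
            System.out.println("Connected: " + c.getMetaData().getDatabaseProductName());
        }
    }
}

Once this works, the same class name and URL should map onto Spoon's Sybase connection type with Native (JDBC) access, or a Generic database connection; verify against your PDI version.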

Pentaho Report Designer 5.0.1 and MySQL 8.0 is not working

I am trying to connect to MySQL 8.0 in Report Designer, and I have added mysql-connector-java-8.0.16.jar to the report-designer\lib\jdbc path.

There is no connection; it shows this error:

Code:

Error connecting to database [test] : org.pentaho.di.core.exception.KettleDatabaseException:
Error occured while trying to connect to the database


Error connecting to database: (using class org.gjt.mm.mysql.Driver)
Could not create connection to database server.




org.pentaho.di.core.exception.KettleDatabaseException:
Error occured while trying to connect to the database


Error connecting to database: (using class org.gjt.mm.mysql.Driver)
Could not create connection to database server.




    at org.pentaho.di.core.database.Database.normalConnect(Database.java:415)
    at org.pentaho.di.core.database.Database.connect(Database.java:353)
    at org.pentaho.di.core.database.Database.connect(Database.java:306)
    at org.pentaho.di.core.database.Database.connect(Database.java:294)
    at org.pentaho.di.core.database.DatabaseFactory.getConnectionTestReport(DatabaseFactory.java:84)
    at org.pentaho.di.core.database.DatabaseMeta.testConnection(DatabaseMeta.java:2459)
    at org.pentaho.ui.database.event.DataHandler.testDatabaseConnection(DataHandler.java:541)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
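A plausible cause (worth verifying): Report Designer 5.0.1 asks for the legacy driver class org.gjt.mm.mysql.Driver, which Connector/J 8 no longer ships; only com.mysql.cj.jdbc.Driver (plus a deprecated com.mysql.jdbc.Driver shim) remain. A quick standalone probe to confirm which class names resolve against the jar you installed:

Code:

public class DriverProbe {
    public static void main(String[] args) {
        // Run with mysql-connector-java-8.0.16.jar on the classpath.
        String[] candidates = {
            "org.gjt.mm.mysql.Driver",   // what PRD 5.0.1 asks for
            "com.mysql.jdbc.Driver",     // deprecated shim in Connector/J 8
            "com.mysql.cj.jdbc.Driver"   // Connector/J 8 driver class
        };
        for (String name : candidates) {
            try {
                Class.forName(name);
                System.out.println(name + " -> found");
            } catch (ClassNotFoundException e) {
                System.out.println(name + " -> NOT on classpath");
            }
        }
    }
}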

using date variable in dateDiff in javascript step

I set a variable in another transformation... it is read in as a date. In the transformation with the error, I do this:

Code:

var now_time = new Date();
var sixty_second_count = getVariable("SIXTY_SECOND_COUNT", 0);
var sixty_second_count_plus_one = sixty_second_count + 1;
var sixty_second_start = getVariable("SIXTY_SECOND_START", now_time);


if (sixty_second_count == 19)
    {
    writeToLog("m", "We hit 19 requests");
    setVariable("SIXTY_SECOND_COUNT", 0, "r");
    java.lang.Thread.sleep(60 - dateDiff(sixty_second_start, now_time, "ss"));
    writeToLog("m", "Done waiting for 60 seconds to end");
    setVariable("SIXTY_SECOND_START", new Date(), "r");
    }
else
    {
    setVariable("SIXTY_SECOND_COUNT", sixty_second_count + 1 , "r");
    writeToLog("m", "60 second count now at: " + sixty_second_count_plus_one);
    writeToLog("m", "Time into 60 second interval: " + dateDiff(sixty_second_start, now_time, "ss"));
    }


if (dateDiff(sixty_second_start, now_time, "ss") >= 60)
    {
    setVariable("SIXTY_SECOND_START", new Date(), "r");
    setVariable("SIXTY_SECOND_COUNT", 0, "r");
    writeToLog("m", "We reached 60 seconds and reset start time and 60 second interval count");
    }

I'm getting this error:

Quote:

2019/04/29 16:53:27 - js get req body.0 - org.mozilla.javascript.EvaluatorException: Cannot convert 2019-04-29 16:53:27 to java.util.Date (script#41) (script#41)
Why does the dateDiff function need to convert the date if it is already a date?
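A hedged explanation, inferred from the error text rather than verified: Kettle variables are stored as strings, so getVariable() returns the string "2019-04-29 16:53:27" rather than a Date object, and dateDiff then fails to coerce it. A sketch of parsing it back explicitly in the same JavaScript step (the format string is assumed from the error message; now_time comes from the original script):

Code:

// Variables come back as strings; parse explicitly before using dateDiff.
var start_str = getVariable("SIXTY_SECOND_START", "");
var sixty_second_start = str2date(start_str, "yyyy-MM-dd HH:mm:ss");

// Note too that java.lang.Thread.sleep() takes milliseconds, so waiting
// out the rest of a 60-second window needs the seconds multiplied by 1000.
java.lang.Thread.sleep((60 - dateDiff(sixty_second_start, now_time, "ss")) * 1000);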

Logging Best Practice and "Job ID"

Hey there ---

I'm re-imagining my logging procedures, trying to use more modular/standard templates.
I've already done this in SSIS. Pentaho PDI has its uses too, of course.

One thing I've noticed is useful is a universal logging table -- not logging to separate tables for separate ETL.

One thing SSIS has that Pentaho PDI doesn't is a 'generated' unique job ID (Numeric) that doesn't change with name changes.

Pentaho doesn't have unique job IDs, right? The only built-in internal variable for a job really is "Job Name". Names are changeable and can break referential links.

I'm wondering if anyone here has logging best practices. And before it's mentioned: most "out of the box" logging -- for Pentaho or SSIS -- is not very good. It's fine if you're short on time, but then you'll probably never look at it, or you'll spend way too much time digging through it.

I guess the answer is to create a variable "Job ID" for each job. The thing is, this is pretty manual (to make it unique). Then, when the job's name changes, the ID is persisted, as in many technology paradigms. I know there are "batch IDs", but that's different.

Again, to make this really concrete: you could check a log table and ask "when is the last time Load_Call ran successfully, and how long did it take?". If this is based on name alone -- names don't change frequently, if at all, for jobs, but they could change. Right?
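To make the query side of that concrete, a hedged sketch against a hypothetical universal log table (the table and column names are illustrative, not PDI's built-in logging schema), keyed by a stable job_id rather than the job name:

Code:

SELECT start_date, end_date, end_date - start_date AS duration
FROM job_log
WHERE job_id = 42          -- stable surrogate key, survives job renames
  AND status = 'SUCCESS'
ORDER BY end_date DESC
LIMIT 1;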

Read data from PDF file and need to load into table

Hi,

There is a requirement for PDF data loads: we need to read the data from a PDF file and load it into a table. I am not able to process the data row- and column-wise because, when transforming the data from the PDF, the data comes out in an undefined, unstructured way.


The column headers and their respective rows are not populated in order.

Could you please tell me how we can read the column headers and their respective data in a proper way?


Thank you
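As a starting point outside PDI, a hedged sketch using Apache PDFBox (not bundled with PDI; adding the pdfbox jar is an assumption) that dumps the text with position-based sorting. PDFs carry no real table structure, so the column layout still has to be reconstructed afterwards, e.g. with a dedicated table-extraction tool such as Tabula:

Code:

import java.io.File;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.text.PDFTextStripper;

public class PdfDump {
    public static void main(String[] args) throws Exception {
        try (PDDocument doc = PDDocument.load(new File(args[0]))) {
            PDFTextStripper stripper = new PDFTextStripper();
            stripper.setSortByPosition(true); // keep reading order closer to the layout
            System.out.println(stripper.getText(doc));
        }
    }
}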

Problem with Merge Rows (diff) step

I am having a problem comparing two fields using the Merge Rows (diff) step. My "reference" row has only one field, which contains a filename from a result set. The "compare" is a filename from a file directory. I just want to test whether the filename from the "reference" row exists in the file directory. If I test this transformation on its own, substituting the result set with a "Generate Rows" step, it works fine. When I run the job I get this error:

Quote:

2019/05/02 11:16:48 - cdr_file_from_directory.0 - TRIMMED FILE: cdr_StandAloneCluster-1_01_201905021745_344871
2019/05/02 11:16:48 - Merge Rows (diff).0 - ERROR (version 7.1.0.0-12, build 1 from 2017-05-16 17.18.02 by buildguy) : Unexpected error
2019/05/02 11:16:48 - Merge Rows (diff).0 - ERROR (version 7.1.0.0-12, build 1 from 2017-05-16 17.18.02 by buildguy) : org.pentaho.di.core.exception.KettleException:
2019/05/02 11:16:48 - Merge Rows (diff).0 - Invalid layout detected in input streams, keys and values to merge have to be of identical structure and be in the same place in the rows
2019/05/02 11:16:48 - Merge Rows (diff).0 -
2019/05/02 11:16:48 - Merge Rows (diff).0 - We detected rows with varying number of fields, this is not allowed in a transformation. The first row contained 4 fields, another one contained 1 : [cdr_file String]
2019/05/02 11:16:48 - Merge Rows (diff).0 -
2019/05/02 11:16:48 - Merge Rows (diff).0 -
2019/05/02 11:16:48 - Merge Rows (diff).0 - at org.pentaho.di.trans.steps.mergerows.MergeRows.processRow(MergeRows.java:99)
2019/05/02 11:16:48 - Merge Rows (diff).0 - at org.pentaho.di.trans.step.RunThread.run(RunThread.java:62)
2019/05/02 11:16:48 - Merge Rows (diff).0 - at java.lang.Thread.run(Thread.java:748)
2019/05/02 11:16:48 - Merge Rows (diff).0 - Caused by: org.pentaho.di.core.exception.KettleRowException:
2019/05/02 11:16:48 - Merge Rows (diff).0 - We detected rows with varying number of fields, this is not allowed in a transformation. The first row contained 4 fields, another one contained 1 : [cdr_file String]
2019/05/02 11:16:48 - Merge Rows (diff).0 -
2019/05/02 11:16:48 - Merge Rows (diff).0 - at org.pentaho.di.trans.step.BaseStep.safeModeChecking(BaseStep.java:2121)
2019/05/02 11:16:48 - Merge Rows (diff).0 - at org.pentaho.di.trans.steps.mergerows.MergeRows.checkInputLayoutValid(MergeRows.java:262)
2019/05/02 11:16:48 - Merge Rows (diff).0 - at org.pentaho.di.trans.steps.mergerows.MergeRows.processRow(MergeRows.java:97)
2019/05/02 11:16:48 - Merge Rows (diff).0 - ... 2 more
2019/05/02 11:16:48 - Merge Rows (diff).0 - Finished processing (I=0, O=0, R=2, W=0, U=0, E=1)
2019/05/02 11:16:48 - see_if_file_received_from_sftp - Transformation detected one or more steps with errors.
2019/05/02 11:16:48 - see_if_file_received_from_sftp - Transformation is killing the other steps!
2019/05/02 11:16:48 - see_if_file_received_from_sftp - ERROR (version 7.1.0.0-12, build 1 from 2017-05-16 17.18.02 by buildguy) : Errors detected!
2019/05/02 11:16:48 - select_only_one_field_from_file_directory.0 - Finished processing (I=0, O=0, R=1, W=1, U=0, E=0)
2019/05/02 11:16:48 - cdr_file_from_directory.0 - Finished processing (I=0, O=0, R=1, W=1, U=0, E=0)
2019/05/02 11:16:49 - see_if_file_received_from_sftp - ERROR (version 7.1.0.0-12, build 1 from 2017-05-16 17.18.02 by buildguy) : Errors detected!
This makes no sense: as you can see, there is only one field coming in from either side. How could there be 4? Also, I thought I could attach a transformation here...

I have attached the transformation.

Table input step problem when using substring function on Firebird 3.x

Hi

I really need your help; I have tested everything that comes to mind.

We have a PDI job that loads data from Firebird; Firebird was upgraded to version 3 (previously 2).

I changed the drivers by replacing the Firebird driver with data-integration/lib/jaybird-full-3.0.5.jar, and I now have the following problem (see the attached screenshots). The funny thing is that I tested with two separate Spoons, same version, extracted from the SourceForge archive at different times, so they should be the same setup with the same Firebird drivers; one works without problems, while the other renames the field to SUBSTRING, as you can see in the screenshot.

The same thing happens on the live Linux Carte server: it works in Spoon and on the master server, but not on the CentOS server running Carte, no matter which PDI version or driver I use; I have tried most of the major versions from 5 to 8.2.

https://imgur.com/QaxAw5e
https://imgur.com/cBBkr9C
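One hedged workaround to try (assuming the renaming comes from how Jaybird 3 reports the column label of an unaliased expression): give the expression an explicit alias in the Table Input SQL, so the field name no longer depends on the driver. Column and table names below are placeholders:

Code:

SELECT SUBSTRING(some_field FROM 1 FOR 10) AS some_field_short
FROM some_table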

Br

Dynamic row to column conversion

Hi,

I need to convert data from rows to columns dynamically. Let me explain it with an example.

Say the source data is:
Code:

select 'ABC-123' as company_id, 4343 amt from dual union all
select 'PQR-111' as company_id, 1111 amt from dual union all
select 'XYZ-222' as company_id, 2345 amt from dual union all
select 'DDD-333' as company_id, 9999 amt from dual union all
select 'IJK-444' as company_id, 1122 amt from dual union all
select 'KLM-555' as company_id, 3344 amt from dual union all
select 'BRT-666' as company_id, 5555 amt from dual union all
select 'IND-777' as company_id, 6666 amt from dual

I need Excel output where company_id becomes the header and amt becomes the corresponding value under the header. Here is the expected output (Excel):
ABC-123  PQR-111  XYZ-222  DDD-333  IJK-444  KLM-555  BRT-666  IND-777
4343     1111     2345     9999     1122     3344     5555     6666

The number of rows in the source data is not fixed. In the sample data there are 8 rows, hence the Excel output has 8 columns. However, it should be dynamic: if the source data has 20 rows, then I should get 20 columns in the Excel output.

Please take a look; any help will be much appreciated. A minimal sketch of the pivot logic itself follows below.
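For reference, the pivot logic independent of any particular PDI step (field names taken from the sample; the tab-separated output stands in for the Excel row layout):

Code:

import java.util.LinkedHashMap;
import java.util.Map;

public class Pivot {
    public static void main(String[] args) {
        // Collect (company_id, amt) pairs in input order, then emit one
        // header row and one value row, however many pairs there are.
        Map<String, Integer> rows = new LinkedHashMap<>();
        rows.put("ABC-123", 4343);
        rows.put("PQR-111", 1111);
        rows.put("XYZ-222", 2345);
        // ... one entry per source row

        System.out.println(String.join("\t", rows.keySet()));
        StringBuilder values = new StringBuilder();
        for (Integer amt : rows.values()) {
            if (values.length() > 0) values.append('\t');
            values.append(amt);
        }
        System.out.println(values);
    }
}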

Regards,
Ritesh

Help on pdi display on laptop

Hi
I am seeking some help, please, with the visual elements of Spoon/PDI. Using Data Integration 8.1.


I have a desktop and a laptop. When I open jobs or transformations on the desktop, they look fine -- like they always have.
But when the same file is opened on the laptop, the icons and text tend to sit on top of each other.
Sure, I could just tidy it up, but I have hundreds of these.


I had a similar issue with SQuirreL SQL (also a Java app), and the explanation there was something about Swing and high-DPI monitors not playing well together.
The laptop is a Microsoft one. The base resolution of the laptop screen is 2256x1504, but there are two conventional 1920x1080 screens as well, and this is where I am working with Spoon.


I have fiddled with the fonts in Tools->Options->Look & Feel but can't seem to improve things.
Does anybody know of any other switches or settings that could alleviate this?
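One hedged experiment, not a confirmed fix: Spoon's UI is SWT-based, and newer SWT versions expose an auto-scale hint for mixed-DPI setups; whether the SWT build shipped with PDI 8.1 honors it is an assumption worth testing. Add it to the OPT line in spoon.bat:

Code:

REM Hedged experiment: pin SWT widget scaling to 100% on mixed-DPI setups
set OPT=%OPT% "-Dswt.autoScale=100"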


Thanks
JC

How to Get Max from different Column

Hi all, a question please.

I have data with columns a, b, c,
which have integer values in them:

1,3,5
4,9,10
11,5,8
2,88,1

How can I make Pentaho get the max value of each of these rows?

1,3,5 result 5
4,9,10 result 10
11,5,8 result 11
2,88,1 result 88

thanks in advance
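A minimal sketch for a Modified Java Script Value step (field names a, b, c taken from the question; the output field name row_max is an assumption):

Code:

// Row-wise maximum of the three integer columns; Math.max in the
// JavaScript step accepts multiple arguments.
var row_max = Math.max(a, b, c);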