Channel: Pentaho Community Forums

How to add a static value to PRD prompts

We need to add a row to the result set of an MQL query that is used to populate a parameter drop-down.


eg:


MQL query returns:


EUROPE
ASIA
AMERICA
AUSTRALIA
AFRICA


Drop down must be populated with values:


ALL VALUES <------------------
EUROPE
ASIA
AMERICA
AUSTRALIA
AFRICA


Direct SQL cannot be used in this context, because we have to go through the metadata layer so that user rights and data visibility are also taken into account.


Is there a viable way to implement this?
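
For illustration only, here is the shape of the idea in plain Java, assuming the parameter query's result is available as a TableModel (for example through a scriptable data source); the class below is a hypothetical helper, not a PRD API. The MQL query itself stays untouched, so metadata security and data visibility still apply to the real rows.

Code:

import javax.swing.table.DefaultTableModel;
import javax.swing.table.TableModel;

// Hypothetical helper: prepends a synthetic "ALL VALUES" row to the result of
// the parameter query. The report's filter logic must treat that value as
// "do not filter".
public class AllValuesWrapper {

    public static TableModel prependAllValues(TableModel mqlResult) {
        DefaultTableModel combined = new DefaultTableModel();

        // copy the column layout of the original result
        for (int c = 0; c < mqlResult.getColumnCount(); c++) {
            combined.addColumn(mqlResult.getColumnName(c));
        }

        // synthetic first entry shown in the drop-down
        Object[] allValuesRow = new Object[mqlResult.getColumnCount()];
        allValuesRow[0] = "ALL VALUES";
        combined.addRow(allValuesRow);

        // append the rows returned by the MQL query unchanged
        for (int r = 0; r < mqlResult.getRowCount(); r++) {
            Object[] row = new Object[mqlResult.getColumnCount()];
            for (int c = 0; c < mqlResult.getColumnCount(); c++) {
                row[c] = mqlResult.getValueAt(r, c);
            }
            combined.addRow(row);
        }
        return combined;
    }
}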


Thank you
Virgilio

Read a huge file incrementally (many lines at a time) in every execution

Hi everyone,

I need to read a huge CSV file, and I need to do it across several executions. My idea is to read the first 10,000 lines in the first execution, lines 10,001 to 20,000 in the second, and so on.

I've been trying to use the Set Variables step, setting the default values of two parameters (init_value = 0, end_value = 10000) defined in the Parameters section of the transformation, and then updating those values to 10001 and 20000 (and so on) for each execution.

Then, using the Sample Rows step, I read the lines of the file with the range ${init_value}..${end_value}. The first time it works with those values (0, 10000), but I don't know how to update them to (init_value = 10001, end_value = 20000) and so on for the next executions.
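
The underlying idea, stripped of Kettle, is simply to persist the last offset between executions and skip that many lines on the next run. A minimal sketch of that pattern is below; the file names and chunk size are placeholders, and in Kettle the offset could just as well live in a small table or properties file read by a Get Variables step.

Code:

import java.io.BufferedReader;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

// Sketch of "resume where the last execution stopped": the next start line is
// persisted in a small properties file between runs.
public class ChunkedCsvReader {

    private static final int CHUNK_SIZE = 10_000;

    public static void main(String[] args) throws Exception {
        Path stateFile = Paths.get("csv_offset.properties");
        Properties state = new Properties();
        if (Files.exists(stateFile)) {
            try (InputStream in = Files.newInputStream(stateFile)) {
                state.load(in);
            }
        }
        long start = Long.parseLong(state.getProperty("init_value", "0"));

        long processed = 0;
        try (BufferedReader reader = Files.newBufferedReader(Paths.get("huge.csv"))) {
            String line;
            long lineNumber = 0;
            while ((line = reader.readLine()) != null && processed < CHUNK_SIZE) {
                if (lineNumber++ < start) {
                    continue; // already handled by a previous execution
                }
                handleRow(line);
                processed++;
            }
        }

        // remember where the next execution should start
        state.setProperty("init_value", String.valueOf(start + processed));
        try (OutputStream out = Files.newOutputStream(stateFile)) {
            state.store(out, "next chunk start");
        }
    }

    private static void handleRow(String line) {
        // placeholder for the real per-row processing
    }
}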

Does anyone have an idea how I could do this, or another approach I could use to read the file this way?
I appreciate any suggestion!

Thanks!!

Stopping the report midway if it's taking a long time?

Hi Taqua,

I just want to know whether there is any way to stop the report execution midway if the process is taking a long time after the report has started. I am using PRD 3.9 with a PostgreSQL database. Is it possible to stop the report during processing?

Thanks in advance,

Raj

PDI + CDH4.4.0 : Unable to get VFS File object for filename

I am trying to use PDI + CDH4.4.0.

My Hadoop cluster is working fine.

When I try to copy files to the cluster I get the following error:

Code:

2014/01/20 14:19:50 - Hadoop Copy Files - ERROR (version  4.4.0-stable, build 17588 from 2012-11-21 16.02.21 by buildguy) : Não é  possível copiar a pasta/o arquivo [d:/weblogs_rebuild.txt] a  [hdfs://10.239.69.200:8020/test]. Excepção: [
2014/01/20 14:19:50 - Hadoop Copy Files - ERROR (version 4.4.0-stable, build 17588 from 2012-11-21 16.02.21 by buildguy) :
2014/01/20 14:19:50 - Hadoop Copy Files - ERROR (version 4.4.0-stable,  build 17588 from 2012-11-21 16.02.21 by buildguy) : Unable to get VFS  File object for filename 'hdfs://10.239.69.200:8020/test' : Could not  resolve file "hdfs://10.239.69.200:8020/test".
2014/01/20 14:19:50 - Hadoop Copy Files - ERROR (version 4.4.0-stable, build 17588 from 2012-11-21 16.02.21 by buildguy) :
2014/01/20 14:19:50 - Hadoop Copy Files - ERROR (version 4.4.0-stable, build 17588 from 2012-11-21 16.02.21 by buildguy) : ]
2014/01/20 14:19:50 - Hadoop Copy Files - ERROR (version 4.4.0-stable,  build 17588 from 2012-11-21 16.02.21 by buildguy) :  org.pentaho.di.core.exception.KettleFileException:
2014/01/20 14:19:50 - Hadoop Copy Files - ERROR (version 4.4.0-stable, build 17588 from 2012-11-21 16.02.21 by buildguy) :
2014/01/20 14:19:50 - Hadoop Copy Files - ERROR (version 4.4.0-stable,  build 17588 from 2012-11-21 16.02.21 by buildguy) : Unable to get VFS  File object for filename 'hdfs://10.239.69.200:8020/test' : Could not  resolve file "hdfs://10.239.69.200:8020/test".
2014/01/20 14:19:50 - Hadoop Copy Files - ERROR (version 4.4.0-stable, build 17588 from 2012-11-21 16.02.21 by buildguy) :
2014/01/20 14:19:50 - Hadoop Copy Files - ERROR (version 4.4.0-stable, build 17588 from 2012-11-21 16.02.21 by buildguy) :
2014/01/20 14:19:50 - Hadoop Copy Files - ERROR (version 4.4.0-stable,  build 17588 from 2012-11-21 16.02.21 by buildguy) :    at  org.pentaho.di.core.vfs.KettleVFS.getFileObject(KettleVFS.java:161)
2014/01/20 14:19:50 - Hadoop Copy Files - ERROR (version 4.4.0-stable,  build 17588 from 2012-11-21 16.02.21 by buildguy) :    at  org.pentaho.di.core.vfs.KettleVFS.getFileObject(KettleVFS.java:104)
2014/01/20 14:19:50 - Hadoop Copy Files - ERROR (version 4.4.0-stable,  build 17588 from 2012-11-21 16.02.21 by buildguy) :    at  org.pentaho.di.job.entries.copyfiles.JobEntryCopyFiles.ProcessFileFolder(JobEntryCopyFiles.java:376)
2014/01/20 14:19:50 - Hadoop Copy Files - ERROR (version 4.4.0-stable,  build 17588 from 2012-11-21 16.02.21 by buildguy) :    at  org.pentaho.di.job.entries.copyfiles.JobEntryCopyFiles.execute(JobEntryCopyFiles.java:324)
2014/01/20 14:19:50 - Hadoop Copy Files - ERROR (version 4.4.0-stable,  build 17588 from 2012-11-21 16.02.21 by buildguy) :    at  org.pentaho.di.job.Job.execute(Job.java:589)
2014/01/20 14:19:50 - Hadoop Copy Files - ERROR (version 4.4.0-stable,  build 17588 from 2012-11-21 16.02.21 by buildguy) :    at  org.pentaho.di.job.Job.execute(Job.java:728)
2014/01/20 14:19:50 - Hadoop Copy Files - ERROR (version 4.4.0-stable,  build 17588 from 2012-11-21 16.02.21 by buildguy) :    at  org.pentaho.di.job.Job.execute(Job.java:443)
2014/01/20 14:19:50 - Hadoop Copy Files - ERROR (version 4.4.0-stable,  build 17588 from 2012-11-21 16.02.21 by buildguy) :    at  org.pentaho.di.job.Job.run(Job.java:363)

Hadoop Packages on the Server:

Code:

hadoop-hdfs-namenode-2.0.0+1475-1.cdh4.4.0.p0.23.el6.x86_64
hadoop-2.0.0+1475-1.cdh4.4.0.p0.23.el6.x86_64
hadoop-mapreduce-2.0.0+1475-1.cdh4.4.0.p0.23.el6.x86_64
hadoop-0.20-mapreduce-2.0.0+1475-1.cdh4.4.0.p0.23.el6.x86_64
hadoop-hdfs-2.0.0+1475-1.cdh4.4.0.p0.23.el6.x86_64
hadoop-yarn-2.0.0+1475-1.cdh4.4.0.p0.23.el6.x86_64
hadoop-client-2.0.0+1475-1.cdh4.4.0.p0.23.el6.x86_64
hadoop-0.20-mapreduce-jobtracker-2.0.0+1475-1.cdh4.4.0.p0.23.el6.x86_64

I tried to use the following PDI Packages:

Code:

PDI 4.4.0-stable;
PDI 4.4.0-stable + Big-Data-Plugin Version 1.3.3.1; and
PDI 5.0.1-stable

All three options gave the same error message.


And I can see this in the Hadoop log:

Code:

2014-01-20 08:24:27,686 WARN org.apache.hadoop.ipc.Server:  Incorrect header or version mismatch from 10.239.69.20:53593 got version  3 expected version 7
Can anyone help me?

Regards.

PDI + Hadoop: Permission Denied to copy file

How can I define user/password to be used on 'Hadoop Copy Files' step ?

I'm trying

Code:

hdfs://sicat:sicatuser@10.239.69.200:8020/user/sicat
But I'm getting the following error message:

Code:

2014/01/20 15:59:33 - Hadoop Copy Files - ERROR (version 5.0.1-stable, build 1 from 2013-11-15_16-08-58 by buildguy) : File System Exception: Could not copy "file:///d:/weblogs_rebuild.txt" to "hdfs://sicat:***@10.239.69.200:8020/user/sicat/weblogs_rebuild.txt".
2014/01/20 15:59:33 - Hadoop Copy Files - ERROR (version 5.0.1-stable, build 1 from 2013-11-15_16-08-58 by buildguy) : Caused by: Could not write to "hdfs://sicat:***@10.239.69.200:8020/user/sicat/weblogs_rebuild.txt".
2014/01/20 15:59:33 - Hadoop Copy Files - ERROR (version 5.0.1-stable, build 1 from 2013-11-15_16-08-58 by buildguy) : Caused by: Permission denied: user=kleysonrios, access=EXECUTE, inode="/user/sicat":sicat:hadoop:drwxr-x---
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:224)
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkTraverse(FSPermissionChecker.java:177)
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:142)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:4705)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:4687)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:4661)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1839)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:1771)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:1747)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:439)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:207)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44942)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1002)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1751)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1747)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1745)

So PDI is trying to use my Windows username instead of the username defined in the URL.
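
As far as I understand it, with simple (non-Kerberos) Hadoop security the HDFS client identifies itself as the local OS user unless told otherwise, which would explain user=kleysonrios in the error. Outside of PDI, plain Java code can pick the remote user explicitly, as in the sketch below; whether the Hadoop Copy Files step honours the user embedded in the URL depends on the PDI shim, so treat this only as an illustration of the mechanism.

Code:

import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Copies a local file to HDFS while acting as the remote user "sicat"
// instead of the local OS account (simple authentication assumed).
public class HdfsCopyAsUser {

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(new URI("hdfs://10.239.69.200:8020"), conf, "sicat");
        try {
            fs.copyFromLocalFile(new Path("d:/weblogs_rebuild.txt"),
                                 new Path("/user/sicat/weblogs_rebuild.txt"));
        } finally {
            fs.close();
        }
    }
}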

Regards.

is there any current or planned support for Redshift using Mondrian?

I'm looking for any information on whether Mondrian can be used to query AWS Redshift. Although Redshift is Postgres-based, it looks like there are some SQL differences that would likely impact the use of the Mondrian/MDX library.

thanks!

Charles

How to setup Pentaho CE with LDAP using bind authentication

Hi,

I am trying to set up Pentaho CE to use LDAP authentication. I have been successful in setting up the authentication; however, I am currently hard-coding contextSource.userDn and contextSource.password in applicationContext-security-ldap.properties as below.

contextSource.userDn=uid\=admin,ou\=system
contextSource.password=secret

Our company policy does not allow us to hard-code the user and password, and we cannot have a common read-only user for authentication. My requirement is to use the login and password provided on the PUC to authenticate the user. How can I achieve this? I want something like the following:

contextSource.userDn=uid\=PUC_LOGIN,ou\=system
contextSource.password=PUC_PASSWORD

Is something like below possible?

contextSource.userDn=uid\={0},ou\=system
contextSource.password={1}
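
For context, the {0} placeholder is how bind authentication works in Spring Security's BindAuthenticator: the DN pattern is filled in with the login typed on the PUC, and that user's own password is used for the bind, so no shared read-only account is needed. A rough programmatic sketch is below; package names and the exact wiring depend on the Spring Security version bundled with your BI server, so treat it as an assumption rather than the Pentaho configuration itself.

Code:

import org.springframework.security.authentication.UsernamePasswordAuthenticationToken;
import org.springframework.security.ldap.DefaultSpringSecurityContextSource;
import org.springframework.security.ldap.authentication.BindAuthenticator;

// Bind authentication: the user's own credentials are used for the LDAP bind,
// so no service-account password has to be stored in a properties file.
public class LdapBindSketch {

    public static void main(String[] args) throws Exception {
        DefaultSpringSecurityContextSource contextSource =
                new DefaultSpringSecurityContextSource("ldap://ldap.example.com:10389/");
        contextSource.afterPropertiesSet();

        BindAuthenticator authenticator = new BindAuthenticator(contextSource);
        // {0} is replaced with the login entered on the PUC
        authenticator.setUserDnPatterns(new String[] { "uid={0},ou=system" });

        // credentials as they would arrive from the login form
        authenticator.authenticate(
                new UsernamePasswordAuthenticationToken("latif", "passwordFromLoginForm"));
    }
}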




Thank You.

Regards,
Latif

MDX Query Help - Or cube concept?

Hello,

I have a fact table for sales, and a dimension table Client.

I need to create a report showing how many clients joined per year. I have a Time dimension, and a join_date column in Client (which does NOT link to the Time dimension).
If I create a cube adding join_date as a level of the Client dimension, I wouldn't know how to filter on it. Does anyone know how I can achieve this result?


Thanks.

Upgrade to mondrian 3.6.1 inside BI-Server 4.1

Is it possible to upgrade to Mondrian 3.6.1 inside BI-Server 4.1, or is it only designed to work with Pentaho 5.0.1?

We have found a bug that seems to be fixed in 3.6.1 (see http://jira.pentaho.com/browse/MONDRIAN-1485).

Simply replacing mondrian.jar does not work: we get errors about the mondrian.olap.Util class not being found, and other classes seem to be needed as well...

Thank you !

Hugues

socket creation error

Hi,

I'm trying to connect to a MySQL database using JNDI, but I'm getting this error message:

"org.pentaho.di.core.exception.KettleDatabaseException: Error occured while trying to connect to the database


Invalid JNDI connection GLPI : socket creation error"

Could anybody help me? Here are some more details about my configuration environment:

MySQL driver located at "data-integration/libext/JDBC":
mysql-connector-java-5.1.26-bin.jar


JNDI jdbc.properties file:

GLPI/type=javax.sql.DataSource
GLPI/driver=com.mysql.jdbc.Driver
GLPI/user=glpi-teste
GLPI/password=****
GLPI/url=jdbc:mysql://IPADRESS:3306/glpi-teste

JNDI connection and error screen:
GLPI-JNDI - Error Message.jpg

Get files with SecureFTP using proxy

I'm using the Get files with SecureFTP job entry behind a proxy server. In another environment without the proxy server, the step works correctly. Behind the proxy, I get the following error:

Error trying to connect to xxxxx .Exception :
com.jcraft.jsch.JSchException: ProxyHTTP: java.io.IOException: proxy error: Forbidden

The proxy configuration uses an HTTP proxy on port 3128. The same configuration in WinSCP works fine.

Is this a bug in the "Get files with SecureFTP" job entry? This might be caused by the proxy itself. Not sure why the proxy would allow the traffic from WinSCP but not from JSch.
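
One way to narrow it down might be to reproduce the connection with plain JSch (the library PDI uses, judging by the exception) outside of Kettle. If the minimal program below also gets "proxy error: Forbidden", the proxy is rejecting JSch's CONNECT request rather than PDI mishandling the settings; host names, ports and credentials are placeholders.

Code:

import com.jcraft.jsch.JSch;
import com.jcraft.jsch.ProxyHTTP;
import com.jcraft.jsch.Session;

// Minimal JSch connection through an HTTP proxy, similar to what the
// "Get files with SecureFTP" entry does internally.
public class SftpViaProxyTest {

    public static void main(String[] args) throws Exception {
        JSch jsch = new JSch();
        Session session = jsch.getSession("sftpuser", "sftp.example.com", 22);
        session.setPassword("sftppassword");

        ProxyHTTP proxy = new ProxyHTTP("proxy.example.com", 3128);
        // proxy.setUserPasswd("proxyuser", "proxypassword"); // if the proxy requires authentication
        session.setProxy(proxy);

        session.setConfig("StrictHostKeyChecking", "no");
        session.connect(30000); // fails with a JSchException on "proxy error: Forbidden"
        System.out.println("Connected through the proxy");
        session.disconnect();
    }
}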

How Different is Pentaho EE Compared to Pentaho CE?

I am test driving Pentaho for my company. So far I really like the CDE (Dashboard) and Kettle feature.

My question is: since I'm test driving Pentaho CE, if I want my company to move to Pentaho EE, how hard is the transition? I've built some Kettle jobs/transformations and CDE dashboards and have gotten pretty used to the workflow and process. Will there be a new learning curve if I use Pentaho EE?

How to join a step's output with a table that has no common columns?

I have a .ktr in which I joined 5 tables on a common column and then filtered out certain unnecessary columns.

So the end of the transformation produces a set of columns that have passed through steps such as a row filter.

Now I want that output combined with a single new table (one that was not previously joined) that has NO columns in common with it, so that a few columns from this new table appear alongside the generated output.

How do I go about it?
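
In case it helps to picture it, what is described above is essentially a cross join: every row of the main stream gets the columns of the unrelated table appended even though there is no key to match on (in PDI, I believe the Join Rows (cartesian product) step does exactly this). A toy sketch of the idea, with made-up record types:

Code:

import java.util.List;

// Toy illustration of a cross join: each row of the main stream is combined
// with each row of the unrelated lookup table (record names are made up).
public class CrossJoinSketch {

    record MainRow(String customer, double amount) {}
    record LookupRow(String region, String currency) {}
    record Combined(String customer, double amount, String region, String currency) {}

    public static void main(String[] args) {
        List<MainRow> stream = List.of(new MainRow("ACME", 120.0), new MainRow("Globex", 75.5));
        List<LookupRow> lookup = List.of(new LookupRow("EMEA", "EUR"));

        for (MainRow m : stream) {
            for (LookupRow l : lookup) {
                System.out.println(new Combined(m.customer(), m.amount(), l.region(), l.currency()));
            }
        }
    }
}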

Adding Date in Mail subject | Mail scheduled from biserver ce-5.0.1 scheduler

Hi All,

Please help me add the date to the "mail subject" of a mail scheduled from the Scheduler in Pentaho biserver-ce 5.0.1.

As of now I have a scheduled mail with the report attached, and its subject is "Daily Domestic Report".

I want to send the mail with the subject "Daily Domestic Report {parameter}", where the parameter should be yesterday's date.
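
For the date itself, "yesterday" formatted for a subject line is only a couple of lines of Java; how to feed the resulting string into the scheduler's subject field is the open question, and the format pattern below is just an example.

Code:

import java.text.SimpleDateFormat;
import java.util.Calendar;

// Builds the mail subject with yesterday's date appended.
public class MailSubjectBuilder {

    public static void main(String[] args) {
        Calendar calendar = Calendar.getInstance();
        calendar.add(Calendar.DAY_OF_MONTH, -1); // yesterday
        String yesterday = new SimpleDateFormat("yyyy-MM-dd").format(calendar.getTime());
        System.out.println("Daily Domestic Report " + yesterday);
    }
}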

Can someone please guide me on how to implement this?

Thanks in advance.

Malibu

.prpt output with present date | Information required

Hi All,

Please help me add the date to the ".prpt output" of a report mailed from the Scheduler in Pentaho biserver-ce 5.0.1, which takes a .prpt file as input.

As of now I have a scheduled mail with the report attached, and its subject is "Daily Domestic Report".

I want to send the mail with the subject "Daily Domestic Report", where the attachment name contains the date of the output report mailed to the end users.
Can someone please guide me on how to implement this?

Thanks in advance.

Malibu

Sql-Query information required in Kettle transformation

Hi All,

I have a SQL query: select paymentnumber_id from payment_details_customers where status = 'active';

The retrieved paymentnumber_id is to be used as a parameter in another query, like the statement below:

select card_number, property_number, customer_name from customer_survey_details where paymentnumber_id=${paymentnumber_id}
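
Outside of Kettle, this chaining is the plain JDBC pattern sketched below (connection details are placeholders). In Kettle itself, I believe the usual equivalent is a first Table Input returning paymentnumber_id that feeds a second Table Input with "Insert data from step" set and a ? placeholder in place of ${paymentnumber_id}.

Code:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Statement;

// Runs the first query and uses each returned id as the bind parameter of the
// second query.
public class ChainedQueries {

    public static void main(String[] args) throws Exception {
        String first = "select paymentnumber_id from payment_details_customers where status = 'active'";
        String second = "select card_number, property_number, customer_name "
                      + "from customer_survey_details where paymentnumber_id = ?";

        try (Connection con = DriverManager.getConnection("jdbc:mysql://localhost:3306/mydb", "user", "password");
             Statement st = con.createStatement();
             ResultSet ids = st.executeQuery(first);
             PreparedStatement ps = con.prepareStatement(second)) {

            while (ids.next()) {
                ps.setLong(1, ids.getLong("paymentnumber_id"));
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        System.out.println(rs.getString("customer_name"));
                    }
                }
            }
        }
    }
}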

How can I achieve this in Pentaho Kettle?

Can someone please guide me?
It would be a great help.

thanks,
malibu

Update table data with value in variable

Hi Guys!!

I am new to Kettle and am having an issue passing the value of a variable to the Execute SQL Script step. Below is an image of the transformation I am trying:

1.jpg

In "Set variable", I am assigning the output of "Table Input 2" to variable CURRENTTS and then based on this ${CURRENTTS} value I am able to get the data from "Table input" step and then the data is getting populated into other table using "Table Output".
Now after "Table output" I want to update the value of variable ${CURRENTTS} into a table for which I used "Get variable" & "Execute SQL Script" but at this point ${CURRENTTS} becomes NULL.

Please suggest ways to update the table with the variable's value.

Pentaho Report Performance

Hi all,

I am generating reports via the PUC (biserver-ce-4.8.0), but it is taking 4 or 5 minutes.

Can someone please help me improve the report generation performance?

Thanks for your advice.

Possibility of integrating the results of Weka's analysis into a web application

I want to use Weka to run analyses on invoices and extract statistics that I then want to display in a web application.
Does Weka offer a way to integrate its results into the web? What is the solution?
Thank you :)
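
For what it's worth, Weka can be embedded directly in a Java web application through its API (weka.jar on the classpath): a servlet or controller can run the analysis and render the resulting numbers. A small sketch is below; the file name and the attribute index are placeholders.

Code:

import java.io.BufferedReader;
import java.io.FileReader;

import weka.core.AttributeStats;
import weka.core.Instances;

// Loads an ARFF dataset of invoices and extracts simple statistics that a
// web layer (servlet, Spring controller, ...) could render as JSON or HTML.
public class InvoiceStats {

    public static void main(String[] args) throws Exception {
        Instances data;
        try (BufferedReader reader = new BufferedReader(new FileReader("invoices.arff"))) {
            data = new Instances(reader);
        }

        // statistics for the first numeric attribute, e.g. the invoice amount
        AttributeStats stats = data.attributeStats(0);
        System.out.println("Rows: " + data.numInstances());
        System.out.println("Mean amount: " + stats.numericStats.mean);
        System.out.println("Max amount:  " + stats.numericStats.max);
    }
}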

Kettle 5.0.1 : Log4j plugin usage

Hi,

I am trying to use log4j logging with Kettle 5.0.1. Here is a link where Matt pointed to using a plugin to route the logging to log4j; however, I couldn't find any details on how to use it.

So far, I have:
- Checked out the code from GitHub and looked through the commit log to understand the purpose/usage of the kettle5-log4j-plugin project, but couldn't.
- Searched the internet and this forum, but found no details either.
- Debugged the Kettle 5.0.1 code to find a way to plug in log4j logging from kettle5-log4j-plugin, but couldn't find one.
- Made various attempts at using Log4jLogging, all in vain.

Please note that I was able to make my Java code pick up my log4j.xml instead of the one from kettle-engine.jar. With that, the commons-vfs code is logged to the configured log4j appender, but it has no effect on the Kettle code. I tried both configuration via log4j.xml and programmatic log4j appenders on "org.pentaho.di"; it still uses the default logging.

Also, any changes made to the Kettle logger have no impact on the resulting logging. For example, the following code doesn't make any difference:

Code:

Logger logger = Logger.getLogger(LogWriter.STRING_PENTAHO_DI_LOGGER_NAME);
logger.setLevel(Level.ERROR);

Also, there seems to be no example, article, or wiki page whatsoever on log4j integration.

Following is the plugin configuration I used in "kettle-logging-plugins.xml". The plugin configuration is read and added to the plugin list, but the plugin class (i.e. Log4jLogging) is never called.

Code:

<?xml version="1.0" encoding="UTF-8"?>
<logging-plugins>
    <logging-plugin id="log4j">
        <description>custom log4j plugin</description>
        <classname>org.pentaho.di.core.logging.log4j.Log4jLogging</classname>
    </logging-plugin>
</logging-plugins>

The log clearly shows that the plugin is registered, but eventAdded() is never called:

Code:

General - Plugin class org.pentaho.di.core.logging.log4j.Log4jLogging registered for plugin type 'Logging Plugin'
Could anyone please point me to the documentation if I missed it? Suggestions are welcome too.

Thanks in advance,
Rakesh