Channel: Pentaho Community Forums

FTP file retrieval issue

Hello.

We use PDI for retrieval of EDI files from a value-added network (VAN). When we retrieve the files via FTP, the VAN automatically moves the file being downloaded to a "prior" file to prevent it from being downloaded twice.

When we migrated our EDI jobs from PDI 4.3 to PDI 5.3, we started noticing that they were failing. It looks like, when run under 5.3, PDI touches the file on the FTP server at the beginning of the job (without downloading it) in a way that makes the VAN believe the file has been downloaded, causing the VAN to move the file to its "prior" status. When the transformation later goes to fetch the file and actually use its contents, the file is gone (actually empty), so the step has nothing to read. The whole thing fails silently: the file was found but was empty, so no error is reported.

Moving the transform back to 4.3 solves the issue. We attempted to use PDI 6, but discovered that Kitchen takes a long time to start up, so we have not been able to test this issue in the latest version to see if the FTP problem has been solved.

Are there any settings present in 5.x that we should look for that might resolve this? Or has this been addressed in a later version, meaning we need to start working with 6? New versions aren't usually a problem, except when things like this come up, so we tend to hang onto old versions longer than we probably should. We are asking now because we are trying to consolidate on one working version for our environment.

Thanks.

-Kevin

PDI 6.1 text file output step cannot find file location

I have a transformation that runs inside a job and creates a text file on a remote Linux server.
We have migrated this transformation from PDI 4.8, where it works.
When using PDI 6.1.0.8 from Spoon and kicking off the job to run on our PDI server
(the same setup as production), the job fails with the error below.
Does anyone know what the problem might be? We've checked directory and file permissions.
The folder is there, but of course it would not be on a C drive.
It's almost as if it thinks the directory is on the local machine... but that may just be a logging misprint...

Thanks..
0:46 - Text file output 2.0 - Released server socket on port 0
2016/12/14 12:00:46 - Text file output.0 - We can not find parent folder [file:///C:/server/pentaho/pentaho/REPORT]!
2016/12/14 12:00:46 - Text file output.0 - ERROR (version 6.1.0.8-268, build 1 from 2016-11-29 10.00.09 by buildguy) : Couldn't open file file:///C:/server/pentaho/pentaho/REPORT/VH.csv
2016/12/14 12:00:46 - Text file output.0 - ERROR (version 6.1.0.8-268, build 1 from 2016-11-29 10.00.09 by buildguy) : org.pentaho.di.core.exception.KettleException:
2016/12/14 12:00:46 - Text file output.0 - Error opening new file : org.apache.commons.vfs2.FileSystemException: Could not create folder "file:///C:".
2016/12/14 12:00:46 - Text file output.0 -
2016/12/14 12:00:46 - Text file output.0 - at org.pentaho.di.trans.steps.textfileoutput.TextFileOutput.openNewFile(TextFileOutput.java:655)
2016/12/14 12:00:46 - Text file output.0 - at org.pentaho.di.trans.steps.textfileoutput.TextFileOutput.init(TextFileOutput.java:755)
2016/12/14 12:00:46 - Text file output.0 - at org.pentaho.di.trans.step.StepInitThread.run(StepInitThread.java:69)
2016/12/14 12:00:46 - Text file output.0 - at java.lang.Thread.run(Thread.java:745)
2016/12/14 12:00:46 - Text file output 2.0 - We can not find parent folder [file:///C:/server/pentaho/pentaho/REPORT]!
2016/12/14 12:00:46 - Text file output 2.0 - ERROR (version 6.1.0.8-268, build 1 from 2016-11-29 10.00.09 by buildguy) : Couldn't open file file:///C:/server/pentaho/pentaho/REPORT/ARO.csv
2016/12/14 12:00:46 - Text file output 2.0 - ERROR (version 6.1.0.8-268, build 1 from 2016-11-29 10.00.09 by buildguy) : org.pentaho.di.core.exception.KettleException:
2016/12/14 12:00:46 - Text file output 2.0 - Error opening new file : org.apache.commons.vfs2.FileSystemException: Could not create folder "file:///C:".
2016/12/14 12:00:46 - Text file output 2.0 -
2016/12/14 12:00:46 - Text file output 2.0 - at org.pentaho.di.trans.steps.textfileoutput.TextFileOutput.openNewFile(TextFileOutput.java:655)
2016/12/14 12:00:46 - Text file output 2.0 - at org.pentaho.di.trans.steps.textfileoutput.TextFileOutput.init(TextFileOutput.java:755)
2016/12/14 12:00:46 - Text file output 2.0 - at org.pentaho.di.trans.step.StepInitThread.run(StepInitThread.java:69)
2016/12/14 12:00:46 - Text file output 2.0 - at java.lang.Thread.run(Thread.java:745)
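
For what it's worth, the C:/ prefix in the error suggests the path was resolved on the local Windows machine, where Apache VFS mapped the Unix-style absolute path /server/pentaho/pentaho/REPORT onto the local drive. A hedged workaround is to parameterize the output directory and define it on the machine that actually runs the job, e.g. in its kettle.properties (the variable name is illustrative):

Code:

# kettle.properties on the server that executes the job
REPORT_DIR=/server/pentaho/pentaho/REPORT

Then use ${REPORT_DIR}/VH as the filename in the Text file output step. It is also worth confirming that the job really executes on the remote server rather than locally in Spoon.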

Failed to connect to a DI Server Instance

Hi, I am new to Pentaho DI. My issue is that when I try to connect to the repository via Tools -> Repository -> Connect, I get the error "Failed to connect to a DI Server Instance. Please check your server connection information and make sure your server is running." When I edit my repository and test it, it says:
Repository URL is not correct. Caused by: Failed to access the WSDL at: http://localhost:8080/pentaho-di/web...epository?wsdl.
It failed with: Got http://localhost:8080/pentaho-di/web...epository?wsdl while opening stream from http://localhost:8080/pentaho-di/web...epository?wsdl.
Your earliest response will be highly appreciated.

Getting data from Web Service - how to stop when there's no more data

Here's the use case: our PDI job polls a web service to retrieve a near-real-time data feed. We call the web service and a number of rows are returned in an XML file. If the number of rows returned equals the maximum allowable batch size, we need to run the job again, and keep doing so until the number of rows returned is less than the batch size. This is the only way to know whether we have gotten all the current data.

How can we loop around a job while a condition is true (rows returned = batch size)?

Many thanks
- Russell
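
In case it helps: inside PDI, the usual pattern is a parent job that runs the polling transformation, copies the row count to a variable, checks it with a "Simple evaluation" job entry, and loops back while it still equals the batch size. If the job is driven from Java instead, the loop can live in the caller. A rough sketch using the PDI Java API, assuming the job exposes its row count through its Result (the .kjb path and batch size are illustrative):

Code:

// Hedged sketch: rerun the job while a full batch keeps coming back.
import org.pentaho.di.core.KettleEnvironment;
import org.pentaho.di.core.Result;
import org.pentaho.di.job.Job;
import org.pentaho.di.job.JobMeta;

public class PollUntilDrained {
  public static void main(String[] args) throws Exception {
    KettleEnvironment.init();
    final long BATCH_SIZE = 1000L; // illustrative maximum batch size
    long rowsReturned;
    do {
      JobMeta jobMeta = new JobMeta("poll_ws.kjb", null); // illustrative path
      Job job = new Job(null, jobMeta);
      job.start();
      job.waitUntilFinished();
      Result result = job.getResult();
      rowsReturned = result.getNrLinesOutput(); // assumes the job tracks output lines
    } while (rowsReturned == BATCH_SIZE);
  }
}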

How to create a new JDBC Oracle RAC connection in Manage Data Sources

Hi,

I am trying, without success, to configure a connection to an Oracle RAC in Pentaho BI Server 6.1. How can I make it work?

The system is building the JDBC URL like this (which is causing the error):

jdbc:thin:@:: (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.76.30)(PORT = 1521)) (LOAD_BALANCE = yes) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = GNCV) (FAILOVER_MODE = (TYPE = SELECT) (METHOD = BASIC) (RETRIES = 180) (DELAY = 5))))

I will appreciate your help.

Thanks!

Java Filter Error

Hi,
in my Java Filter step I have this condition:
(fk_plmn != fk_plmn_LKP) or (A!=B) or......

but i have this error:

Java Filter.0 - org.codehaus.janino.Parser$ParseException: Line 1, Column 9: Expression "fk_plmn != fk_plmn_LKP" is not a type
2016/12/15 18:00:56 - Java Filter.0 - Line 1, Column 9: Expression "fk_plmn != fk_plmn_LKP" is not a type

What's the problem?

thanks
Regards
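
For what it's worth, the Java Filter step hands the condition to Janino as plain Java, and Java has no `or` keyword; that is most likely what trips the parser here. A hedged sketch of the corrected condition, assuming the fields arrive as objects (e.g. Long or String):

Code:

// Use '||' instead of 'or'; for object fields prefer equals() over '!='
// (which compares references). Guard for nulls if the fields can be empty.
!fk_plmn.equals(fk_plmn_LKP) || !A.equals(B)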

Text File Output Header Appending

I'm trying to append to the text output file, but the header seems to be missing. Here is what I did:

  1. Checked if the file exists
  2. If it doesn't exist, created an initial CSV file
  3. If it exists, read that CSV into a serialized file
  4. Then appended the read file and the transformed file


But the issue is that I cannot have a Select values step after reading the file, as my columns are dynamic: sometimes there are 2 and sometimes more than 10, so it varies. Now, while appending, I'm getting an error stating:
Code:

We detected rows with varying number of fields, this is not allowed in a transformation.  The first row contained 0 fields, another one contained 6 :
Please help me on how to achieve this.

User Defined Java Class - how to read a Number (decimal) input value

Hi all,


I have a Number field that contains values with a precision of 2.


I am having trouble reading these fields in my User Defined Java Class step.


I have tried...
Double incomingDecimalValue = get(Fields.In, "the_decimal_field").getDouble(r);


The error returned is...
"A method named "getDouble" is not declared in any enclosing class nor any supertype, nor through a static import"


How do I pull in a decimal (Number) value into my Java code?


Conversely, I am able to collect incoming integer values with...
Long incomingDecimalValue = get(Fields.In, "the_decimal_field").getInteger(r);




Cheers,


Stanbridge
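
In case it is useful to others: in the UDJC FieldHelper API, a PDI Number (double) field is read with getNumber() rather than getDouble(). A minimal sketch, assuming the field really is of type Number:

Code:

// A PDI Number field comes back as a java.lang.Double:
Double incomingDecimalValue = get(Fields.In, "the_decimal_field").getNumber(r);
// For BigNumber (arbitrary-precision) fields there is getBigNumber() instead.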

Java API get resultFiles from subtransformation

I have several PDI transformations executed by a Java application. The transformations share one subtransformation (mapping) which, after processing the data, saves the output as files and also adds the filenames to the result. In my Java application, I need to get the resultFiles saved in the subtransformation, but I am not able to: I can only get the resultFiles saved in transformations called directly by the Java API, not those from the subtransformations. Is that possible? Thank you for any help or suggestions.
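
For reference, the usual collection point looks like the sketch below; whether files added to the result inside a mapping propagate up into the parent transformation's Result is exactly the open question here (the path is illustrative):

Code:

// Hedged sketch: run a transformation from Java and list its result files.
import java.util.List;
import org.pentaho.di.core.KettleEnvironment;
import org.pentaho.di.core.ResultFile;
import org.pentaho.di.trans.Trans;
import org.pentaho.di.trans.TransMeta;

public class CollectResultFiles {
  public static void main(String[] args) throws Exception {
    KettleEnvironment.init();
    TransMeta meta = new TransMeta("parent.ktr"); // illustrative path
    Trans trans = new Trans(meta);
    trans.execute(null);
    trans.waitUntilFinished();
    List<ResultFile> files = trans.getResult().getResultFilesList();
    for (ResultFile f : files) {
      System.out.println(f.getFile().getName());
    }
  }
}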

Unable to reach dashboard page after login

Hi,

I'm using Pentaho 4.8.0 on an Ubuntu machine.
After logging in to the BI server, I get a blank page.
No error is thrown in the browser; the Tomcat logs show this error:

13:36:37,812 ERROR [Logger] misc-class org.pentaho.platform.plugin.services.pluginmgr.DefaultPluginManager: PluginManager.ERROR_0011 - Failed to register plugin cdv
java.lang.NoClassDefFoundError: Could not initialize class com.orientechnologies.orient.core.version.OVersionFactory
at com.orientechnologies.orient.core.config.OStorageConfiguration.create(OStorageConfiguration.java:412)
at com.orientechnologies.orient.core.storage.impl.memory.OStorageMemory.create(OStorageMemory.java:101)
at com.orientechnologies.orient.core.db.raw.ODatabaseRaw.create(ODatabaseRaw.java:127)
at com.orientechnologies.orient.core.db.ODatabaseWrapperAbstract.create(ODatabaseWrapperAbstract.java:53)
at com.orientechnologies.orient.core.db.record.ODatabaseRecordAbstract.create(ODatabaseRecordAbstract.java:168)
at com.orientechnologies.orient.core.db.ODatabaseWrapperAbstract.create(ODatabaseWrapperAbstract.java:53)
at com.orientechnologies.orient.core.db.ODatabaseRecordWrapperAbstract.create(ODatabaseRecordWrapperAbstract.java:58)
at com.orientechnologies.orient.server.OServer.loadStorages(OServer.java:527)
at com.orientechnologies.orient.server.OServer.loadConfiguration(OServer.java:469)
at com.orientechnologies.orient.server.OServer.startup(OServer.java:166)
at com.orientechnologies.orient.server.OServer.startup(OServer.java:157)
at pt.webdetails.cpf.persistence.PersistenceEngine.startOrient(PersistenceEngine.java:617)
at pt.webdetails.cpf.persistence.PersistenceEngine.initialize(PersistenceEngine.java:92)
at pt.webdetails.cpf.persistence.PersistenceEngine.<init>(PersistenceEngine.java:72)
at pt.webdetails.cpf.persistence.PersistenceEngine.getInstance(PersistenceEngine.java:59)
at pt.webdetails.cdv.CdvLifecycleListener.reInit(CdvLifecycleListener.java:56)
at pt.webdetails.cdv.CdvLifecycleListener.init(CdvLifecycleListener.java:52)
at org.pentaho.platform.plugin.services.pluginmgr.PlatformPlugin.init(PlatformPlugin.java:189)
at org.pentaho.platform.plugin.services.pluginmgr.DefaultPluginManager.registerPlugin(DefaultPluginManager.java:199)
at org.pentaho.platform.plugin.services.pluginmgr.DefaultPluginManager.reload(DefaultPluginManager.java:128)
at org.pentaho.platform.plugin.services.pluginmgr.PluginAdapter.startup(PluginAdapter.java:42)
at org.pentaho.platform.engine.core.system.PentahoSystem.notifySystemListenersOfStartup(PentahoSystem.java:342)
at org.pentaho.platform.engine.core.system.PentahoSystem.notifySystemListenersOfStartup(PentahoSystem.java:324)
at org.pentaho.platform.engine.core.system.PentahoSystem.init(PentahoSystem.java:291)
at org.pentaho.platform.engine.core.system.PentahoSystem.init(PentahoSystem.java:208)
at org.pentaho.platform.web.http.context.SolutionContextListener.contextInitialized(SolutionContextListener.java:137)
at org.apache.catalina.core.StandardContext.listenerStart(StandardContext.java:4135)
at org.apache.catalina.core.StandardContext.start(StandardContext.java:4630)
at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:791)
at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:771)
at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:546)
at org.apache.catalina.startup.HostConfig.deployDescriptor(HostConfig.java:637)
at org.apache.catalina.startup.HostConfig.deployDescriptors(HostConfig.java:563)
at org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:498)
at org.apache.catalina.startup.HostConfig.start(HostConfig.java:1277)
at org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:321)
at org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:119)
at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1053)
at org.apache.catalina.core.StandardHost.start(StandardHost.java:785)
at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1045)
at org.apache.catalina.core.StandardEngine.start(StandardEngine.java:445)
at org.apache.catalina.core.StandardService.start(StandardService.java:519)
at org.apache.catalina.core.StandardServer.start(StandardServer.java:710)
at org.apache.catalina.startup.Catalina.start(Catalina.java:581)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:622)
at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:289)
at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:414)


Could you please help with this?

Exception: com.jcraft.jsch.JSchException

Hi all.

I am trying to use the SFTP step to put some files into a server.

I tested my connection from a DOS window with psftp.exe and it worked fine. From WinSCP it also worked fine.

But when I enter the parameters into the Pentaho SFTP step, I get the following message:
Error trying to connect to xxx.xxx.xxx.xxx. Exception: com.jcraft.jsch.JSchException: Auth cancel Auth cancel.
Error connection on xxx.xxx.xxx.xxx

Any idea ?

Thanks in advance.

Guillermo :)
Buenos Aires, Argentina

Pentaho 7.0 and cascading parameters for Interactive reports

Dear all,
I am trying to evaluate Pentaho 7.0 Enterprise Edition and am getting stuck creating interactive reports with cascading parameters.
I have a set of parameters (Query prompts) which are linked to each other:
In my example:
1. Select a specific custodian / bank
2. Second parameter should show me a list of filtered bank portfolios (belonging to the selected bank).
I found a document how to set up cascading parameters.
However, if I select one parameter in the data editor I can only choose between two data types:
* Metadata
* Static list
I am missing the option "SQL query", which, according to the instructions, could be used to pass a parameter value from a different field to the query.
Does anyone have an idea where this option might be available?
Thank you very much for your help,
Stefan

BI 7 tree = HTTP 400 Bad Request

For a new installation I am getting a 400 Bad Request when I browse files.
Has anyone else had this problem?
I installed BI and copied the webapps folder to a freshly installed Tomcat 8.0.

Use parameter value in new field

I am exploring Kettle as a new user.

I have set a parameter in a transformation, and I want to use the parameter value in a new derived field. I have looked at both Calculator and Formula, but can't find a way to get the parameter value into a new field.

What is the best way to approach this? Do I need to write Java/JavaScript to accomplish this?

What if I want to use it as part of the condition? For example for something like
if input_field_1 = ${parameter_value}
then set to 1
else set to 2.

This derivation will be performed at row level.
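
One option is the "Get variables" step, which turns the parameter into a field that later steps can use. Another is a User Defined Java Class step that reads the parameter directly. A rough UDJC sketch, assuming a parameter named parameter_value, an input field input_field_1, and an Integer output field derived_flag defined in the step's Fields tab (all names are illustrative):

Code:

public boolean processRow(StepMetaInterface smi, StepDataInterface sdi) throws KettleException {
  Object[] r = getRow();
  if (r == null) { setOutputDone(); return false; }
  r = createOutputRow(r, data.outputRowMeta.size());
  // Resolve the transformation parameter:
  String paramValue = getVariable("parameter_value", "");
  String in1 = get(Fields.In, "input_field_1").getString(r);
  // 1 if the field matches the parameter, else 2:
  get(Fields.Out, "derived_flag").setValue(r, in1.equals(paramValue) ? 1L : 2L);
  putRow(data.outputRowMeta, r);
  return true;
}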

Thanks

L

Transformation Step's CHANNEL_ID Field

One of the fields available to logging tables containing a transformation's step data is the channel_id field. Sample data from my log table include


  • 2ec6985d-a80e-4e5e-8b04-c374a8e98731
  • c27373e6-b34d-480a-97be-c2f9326ca39f
  • 5f285349-5e38-41d4-9bfb-d494fbe59fbf
  • 6ed8452a-04b2-4272-ae71-2bfcd260de9e


I am logging only the transformation's step data. My transformations output audit tables that I'd like to tie back to the step from which they originate. I am currently guessing at this relationship using timestamps; however, I'd like to use the channel_id since it appears to be a unique identifier.

My question is: how can I get at the step's channel id from within the running transformation? Pardon the question if it's obvious; I've not found anything when searching for this info.

Thank you,
JW
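
Not an authoritative answer, but from inside the transformation a UDJC step can at least reach the transformation-level channel id; whether that matches the per-step CHANNEL_ID written to the step log table needs verifying against your version, since each step carries its own log channel. A hedged fragment (the output field name is illustrative):

Code:

// Inside a UDJC processRow, after createOutputRow(...):
String transChannelId = getTrans().getLogChannelId();
get(Fields.Out, "channel_id").setValue(r, transChannelId); // assumes a String output field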

Kettle Salesforce Insert - Error

Hi All,

I'm receiving the attached error message while trying to insert a record into a Salesforce object. I'm confident that I have the right data types for the input fields, and both the module field and the stream input field have the same name.

I'm using the latest version 7.0.

Any idea what I'm missing here to complete my transformation?

[Attached image: Opportunity_Issue.jpg]

Newbie looking for PDI assistance: Regex string split & 2 other issues

Hi,

I've discovered PDI/Spoon for the first time this weekend and I like it. I'd like to use this tool to replace a whole load of my crappy VBA transformation and parsing code. The workflow, labeling and visibility of the effects of each step in PDI are something I really admire.

I've been replicating my work on one of the "easy" Excel files from the 60 or so I transformed and normalised for a recent project (the purpose was to clean up files before putting them in a homogenized, consolidated, reconciled database). I made quite good progress in a day today using PDI/Spoon, but I have 3 requirements I'd like some guidance on in terms of how to replicate them in Spoon.

1. Split a string field based on the first occurrence of a space, i.e. ' '
I think I need to go to school on regex. I've skirted it a few times in the last few years, but I probably need to actually understand it now. I tried the "Replace in string" step with "use RegEx" set to Y and '/[^ ]*/' as the search pattern, but I'm not having any luck. What should I be doing differently?
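
For splitting on only the first space, a split limit of 2 avoids any complicated pattern. A hedged Java sketch of the idea (usable in a UDJC step; the field name is illustrative):

Code:

// split with limit 2: everything after the first space stays in one piece
String[] parts = fullString.split(" ", 2);
String head = parts[0];
String tail = parts.length > 1 ? parts[1] : "";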

2. "Decumulation"
I've invented a new word, sorry about that. I've got a few files from this project which show, for a collection of accounts, balances that are non-zero cumulative instead of "normal" incremental, and I'd like them to be "normal" incremental. I tried exporting to an H2 table and then importing from the same H2 table so I could use "Execute SQL script", but have since discovered that Execute SQL is not designed for data manipulation. That was my "get out of jail" move... So I need another way to strip the cumulative effect out of a time series of numbers. How else can I do this?

To make this question harder: some of the account lines in the files I received had cumulative data, except that in between some months the account value fell to zero (sweeps, manual entries, etc.). So I would want to say "take the current month value for account x away from the last non-zero occurrence". Does this take me into JavaScript, and can it be done in Spoon?
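
A small stateful UDJC step can do this (the Analytic Query step's LAG function is another route for the simple case). A hedged sketch, assuming rows arrive sorted per account and a Number field named balance; all names are illustrative, and the state would need resetting when the account changes (not shown):

Code:

private double lastNonZero = 0.0; // carried across rows

public boolean processRow(StepMetaInterface smi, StepDataInterface sdi) throws KettleException {
  Object[] r = getRow();
  if (r == null) { setOutputDone(); return false; }
  r = createOutputRow(r, data.outputRowMeta.size());
  double current = get(Fields.In, "balance").getNumber(r);
  // zero rows (sweeps, manual entries) pass through without resetting the state:
  double increment = (current == 0.0) ? 0.0 : current - lastNonZero;
  if (current != 0.0) lastNonZero = current;
  get(Fields.Out, "increment").setValue(r, increment);
  putRow(data.outputRowMeta, r);
  return true;
}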

3. "Pushdown"/"Pushup"
Sorry, another made-up word. Client accounting data is basically pretty ****ty. A pattern I saw in more than one of these files was that there are certain lines so important you want to push their values down (in a new field) until you hit another value you know about, and then push that one down. For instance, you have a heading "Revenues" and you want to tag the small accounts that sit below it as belonging to "Revenues". The next important line item you get to is "Cost of Goods Sold", and you then want to push this value down in the same new field until you reach, say, "R&D Expenses". I use the word "pushup" for the scenario where those key values sit below the line items you want to tag, meaning you start from the bottom and work up. Basically you want to use one or the other approach for any given file like that.
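
"Pushdown" is a fill-down: remember the last heading seen and stamp it onto every following row (for "pushup", reverse the row order first, fill down, then reverse back). A hedged fragment for the same UDJC scaffold as the previous sketch; the heading values are illustrative:

Code:

private String currentHeading = null; // last heading row seen

// inside processRow, after createOutputRow(...):
String item = get(Fields.In, "line_item").getString(r);
if ("Revenues".equals(item) || "Cost of Goods Sold".equals(item)
    || "R&D Expenses".equals(item)) {
  currentHeading = item; // a known heading starts a new group
}
get(Fields.Out, "heading").setValue(r, currentHeading);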


Any assistance appreciated


Regards,


Andy

Step Performance Logging

Has anyone had luck with the logging tables for monitoring step performance? At a basic level, we would like to know how long each step/job entry is taking.

I have enabled the logging tables. I can get start and stop times for the entire job, but I'm having trouble getting the times for each entry/step.

I have also turned on the performance table for steps. I may be able to get times from this table, but I think I would have to figure out when the step starts processing its first row and when it has processed them all.

Does anyone have some queries based on these tables that you have used for reporting? That might give me a better understanding of what is going on.

Thanks
Sean

New WEKA releases: 3.6.15, 3.8.1 and 3.9.1

Hi everyone!

New versions of Weka are available for download from the Weka homepage:

* Weka 3.8.1 - stable version. It is available as ZIP, with Win32 installer, Win32 installer incl. JRE 1.8.0_112, Win64 installer, Win64 installer incl. 64 bit JRE 1.8.0_112 and Mac OS X application with Oracle 64 bit JRE 1.8.0_112.

* Weka 3.9.1 - development version. It is available as ZIP, with Win32 installer, Win32 installer incl. JRE 1.8.0_112, Win64 installer, Win64 installer incl. 64 bit JRE 1.8.0_112 and Mac OS X application with Oracle 64 bit JRE 1.8.0_112.

* Weka 3.6.15 - stable book 3rd edition version. It is available as ZIP, with Win32 installer, Win32 installer incl. JRE 1.8.0_112, Win64 installer, Win64 installer incl. 64 bit JRE 1.8.0_112 and Mac OS X application with Oracle 64 bit JRE 1.8.0_112.

Stable 3.8 receives bug fixes and new features that do not include breaking API changes and maintain serialized model compatibility. 3.9 (development) receives bug fixes and new features that might include breaking API changes and/or render models serialized using earlier versions incompatible.

NOTE: 3.6.15 is the final release of stable-3-6.

Weka homepage:
http://www.cs.waikato.ac.nz/~ml/weka/

Pentaho data mining community documentation:
http://wiki.pentaho.com/display/Pent...+Documentation

Packages for Weka>=3.7.2 can be browsed online at:
http://weka.sourceforge.net/packageMetaData/


What's new in 3.8.1/3.9.1?

Some highlights
---------------

In core weka:

* Package manager now handles redirects generated by SourceForge
* Package manager now employs a new class loading mechanism that attempts to avoid third-party library clashes by isolating the third-party libraries in each package
* new RelationNameModifier, SendToPerspective, WriteWekaLog, Job, StorePropertiesInEnvironment, SetPropertiesFromEnvironment, WriteDataToResult and GetDataFromResult steps in Knowledge Flow
* RandomForest now has an option for computing the mean impurity decrease variable importance scores
* JRip now prunes redundant numeric attribute-value tests from rules
* Knowledge Flow now offers an additional executor service that uses a single worker thread; steps can, if necessary, declare programmatically that they should run in the single-threaded executor.
* GUIs with result lists now support multi-entry delete
* GUIs now support copying/pasting of array configurations to/from the clipboard

In packages:

* Multi-class FLDA in the discriminantAnalysis package
* New implementations in the ensemblesOfNestedDichotomies package
* distributedWekaBase now includes the latest version of Ted Dunning's t-digest quantile estimator, bringing a factor of 4 speedup over the old implementation
* New streamingUnivariateStats package
* RPlugin package updated to support the latest version of MLR
* New wekaDeepLearning4j package - provides a MLP classifier built using the DL4J library. Can work with either CPU-based or GPU-based native libraries
* New logarithmicErrorMetrics package
* New RankCorrelation package, courtesy of Quan Sun. Provides rank correlation metrics, Kendall tau and Spearman rho, for evaluating regression schemes
* New AffectiveTweets package, courtesy of Felipe Bravom. Provides text filters for sentiment analysis of tweets
* New AnalogicalModeling package, courtesy of Nathan Glenn. Provides an exemplar-based approach to modeling
* New MultiObjectiveEvolutionaryFuzzyClassifier package, courtesy of Carlos Martinez Cortes. Provides a fuzzy rule-based classifier
* New MultiObjectiveEvolutionarySearch package, courtesy of Carlos Martinez Cortes. Provides a search method that uses the ENORA multi-objective evolutionary algorithm


As usual, for a complete list of changes refer to the changelogs.

How to use REST Client with 2 URLs and 1 authentication

I am implementing a transformation to get data from a web service with the REST Client step.

The website requires logging in first for authentication.

So I have to call two URLs (one for authentication and another one to get the data).

So I have to make 2 URL ( 1 for authentication and another one for get data)

I tried making two REST Client steps, as in the image below



The first REST Client step uses the POST method for login: I can log in properly.

The second REST Client step uses the GET method to get data: I get the message "First you need to login to ..."



It seems these two REST Client steps work over separate connections.

If I use just one REST Client step, how can I make the URL dynamic so it handles both logging in and getting the data?

Any idea to make it work ?

Thank you
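
One pattern that may help: the login call normally returns a session cookie that the data call has to replay, so in PDI the first REST Client's response header would be captured and passed as a Cookie header on the second step. A plain-Java sketch of the underlying mechanics (URLs and form fields are illustrative):

Code:

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class SessionReuse {
  public static void main(String[] args) throws Exception {
    // 1) POST the login form and capture the session cookie
    HttpURLConnection login =
        (HttpURLConnection) new URL("https://example.com/login").openConnection();
    login.setRequestMethod("POST");
    login.setDoOutput(true);
    try (OutputStream os = login.getOutputStream()) {
      os.write("user=me&pass=secret".getBytes(StandardCharsets.UTF_8));
    }
    String cookie = login.getHeaderField("Set-Cookie"); // e.g. JSESSIONID=...
    login.getInputStream().close();

    // 2) GET the data, replaying the cookie so the server sees the same session
    HttpURLConnection data =
        (HttpURLConnection) new URL("https://example.com/api/data").openConnection();
    data.setRequestProperty("Cookie", cookie);
    System.out.println("HTTP " + data.getResponseCode());
  }
}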