Channel: Pentaho Community Forums

Reading a multiline field from AR Input

Hello Pentaho Community,

I've run into a problem when trying to read an AR form that contains a field with multiple lines in it.
On the form it looks like this:

Quote:

Site: Place ThePlace
InstanceId: OI-randomchars
Room: R1 RM34
When I read this field in Pentaho, nothing works, no matter which step I use: regex manipulation, JavaScript, row operations...
My output is a semicolon-separated text file, and when I open it in Excel the embedded line breaks scramble the results and make them unusable.
If I use an Excel file as output, it works fine, but when I then use that file as input, the same problem occurs.

Is there by any chance a way to work with this multi-line field?
Or is Pentaho simply not designed for such things?

I have tried a lot of things and nothing has worked so far.
Thanks in advance for your consideration.
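
In case it helps: the usual culprit is the embedded CR/LF characters in the field, which break the one-record-per-line assumption of a semicolon-separated text file. One common workaround is to collapse the line breaks before the output step, e.g. with a Replace in String step using the regex \r?\n (regex enabled). A minimal sketch of the same substitution in plain Java, with made-up sample data:

Code:

    public class FlattenMultilineField {
        public static void main(String[] args) {
            // Made-up sample matching the quoted field above.
            String field = "Site: Place ThePlace\nInstanceId: OI-randomchars\nRoom: R1 RM34";
            // Collapse embedded line breaks so the semicolon-separated output
            // keeps one record per line.
            String flat = field.replaceAll("\\r?\\n", " | ");
            System.out.println(flat);
        }
    }

Alternatively, keeping the line breaks but setting an enclosure character (e.g. a double quote) on the Text file output step may let Excel parse multi-line values correctly, since quoted CSV fields are allowed to contain newlines.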

org.pentaho.di.core.exception.KettleException: Cannot find repository "reportdev"

I'm unable to connect to one of my repositories with DI 5.4: RepositoriesMeta.findRepository is returning null when I try to find the repository by name.
Code used:
final RepositoriesMeta repositoriesMeta = new RepositoriesMeta();
repositoriesMeta.readData();
final RepositoryMeta repositoryMeta = repositoriesMeta.findRepository(repositoryName);

Here's what my repositories.xml looks like. It finds the file repository just fine, but it doesn't seem to find the enterprise repository. I'm able to connect to that repository with Spoon, so I know it is there. Is there something I'm missing?

<repositories>
  <repository>
    <id>PentahoEnterpriseRepository</id>
    <name>reportdev</name>
    <description>reportdev</description>
    <repository_location_url>http://10.32.18.41:9080/pentaho-di</repository_location_url>
    <version_comment_mandatory>N</version_comment_mandatory>
  </repository>
  <repository>
    <id>KettleFileRepository</id>
    <name>pentahoTransforms</name>
    <description>pentahoTransforms</description>
    <base_directory>/opt/verdeeco/sys/core/current/pentaho/pentahoTransforms</base_directory>
    <read_only>N</read_only>
    <hides_hidden_files>N</hides_hidden_files>
  </repository>
</repositories>

How to go to the parent directory using {Internal.Job.Filename.Directory}?

How can I go to the parent directory using {Internal.Job.Filename.Directory}? For example, if {Internal.Job.Filename.Directory} is C:\Pentaho_Lab\Test\ETL, how can I store a file in the parent directory, C:\Pentaho_Lab\Test, using the variable {Internal.Job.Filename.Directory}?
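
Since Kettle resolves file paths through Apache VFS, which understands relative ".." segments, one approach that often works is simply appending /.. to the variable in the file name field, for example:

Code:

    ${Internal.Job.Filename.Directory}/../myfile.csv

Whether every step accepts this is not guaranteed; if one rejects it, a fallback is to compute the parent path yourself (strip everything after the last path separator) in a scripting step and store it in a variable.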

Pentaho 6 startup error (MySQL DB)

Hello there,
Need help with Pentaho 6. My environment is as follows:

Windows 7, 64-bit
Oracle JDK 1.7.79, 64-bit
MySQL 5.6.26

A sample error appears in pentaho.log (I attached all log files from the /tomcat/logs folder as logs.zip). Thanks in advance.

2016-01-19 01:35:45,789 ERROR [org.apache.felix.configadmin.1.8.0] [[org.osgi.service.cm.ConfigurationAdmin]]Cannot use configuration org.pentaho.requirejs for [org.osgi.service.cm.ManagedService, id=550, bundle=187/mvn:pentaho/pentaho-requirejs-osgi-manager/6.0.1.0-386]: No visibility to configuration bound to mvn:pentaho/pentaho-server-bundle/6.0.1.0-386
2016-01-19 01:36:24,074 ERROR [org.pentaho.platform.repository2.unified.BackingRepositoryLifecycleManagerSystemListener]
org.pentaho.platform.api.engine.security.userroledao.AlreadyExistsException:
at org.pentaho.platform.security.userroledao.jackrabbit.JcrUserRoleDao.createRole(JcrUserRoleDao.java:123)
at org.pentaho.platform.repository2.mt.RepositoryTenantManager.createTenant(RepositoryTenantManager.java:202)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)

Find first non-null value scanning rows backwards from current row

Intro

Hi there, this is my first post on the Pentaho forums. I've been using PDI for some time now, but the issue I'm facing is one I haven't tackled in Kettle before. Actually, I haven't even stumbled upon such a thing in the past.

Software


I'm using Pentaho Data Integration 5.4

Input data & explanation

Input data from a file (simplified, there are more columns):

Code:

    number      name
    1009      ProductA
    2150      ProductB
    3235      ProductC
              ProductD
              ProductE
    1234      ProductF
    7765      ProductG
    4566      ProductH
              ProductI
    9907      ProductJ

The issue is that the data comes from an xlsx Excel file containing merged cells, so for one value of number there are 1..n rows of values.

After converting that file to CSV, the values in the continuation rows (all but the first row of a merged cell) are missing, except in the one column that was not merged (see records 3 and 6 in the example).

I'm generating a sequence using the Add sequence step; the input is sorted the way it was originally stored in the file.

Steps to achieve the goal

Basically what I need to do is:


  1. Find the first non-null value whose sequence_number is less than current_row.sequence_number
  2. Concatenate the value of the name field onto that matching row
  3. Keep scanning subsequent rows with sequence_number higher than the last one scanned


As stated before, there can be 1..n rows of values in such a case.

Expected output

Code:

    number      name
    1009      ProductA
    2150      ProductB
    3235      ProductC; ProductD; ProductE
    1234      ProductF
    7765      ProductG
    4566      ProductH; ProductI
    9907      ProductJ


My approach


I believe I could do this in a loop, using an Analytic Query step to compute LAG(1), concatenating the name column from a null-valued row onto the previous row and discarding the other column values from the null row, and then repeating this in a loop (say 20 times, assuming that is the maximum), but I consider this a bad idea.

There are probably better ways to achieve this result, for example a Modified Java Script Value step that scans the rows backward from the current one (based on the sequence number), but I'm not aware of such functions, if they even exist.

How can I achieve this using the Modified Java Script Value step, or in any other efficient way, without looping over the entire content of the file until there are no empty rows?

As an additional question: is there any place where the special functions available in Pentaho's JavaScript step are documented? It would probably be a lot easier if I knew what I can do with the existing functions, whose existence I'm unaware of for now.
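
For what it's worth, this can be done without any backward scanning: a single forward pass that carries the last non-empty number down ("fill down") followed by a Group by step with the aggregation type "Concatenate strings separated by ;" should produce exactly the expected output. A plain-Java sketch of that logic (sample data copied from above; in PDI the fill-down part would be one variable persisting across rows in a Modified Java Script Value step):

Code:

    import java.util.ArrayList;
    import java.util.LinkedHashMap;
    import java.util.List;
    import java.util.Map;

    public class FillDownAndConcat {
        public static void main(String[] args) {
            // number is null on the rows that lost their merged-cell value.
            String[][] rows = {
                {"1009", "ProductA"}, {"2150", "ProductB"}, {"3235", "ProductC"},
                {null, "ProductD"}, {null, "ProductE"}, {"1234", "ProductF"},
                {"7765", "ProductG"}, {"4566", "ProductH"}, {null, "ProductI"},
                {"9907", "ProductJ"}
            };
            Map<String, List<String>> grouped = new LinkedHashMap<>();
            String lastNumber = null; // carries the last non-null key forward
            for (String[] row : rows) {
                if (row[0] != null) {
                    lastNumber = row[0]; // "fill down" the merged-cell value
                }
                grouped.computeIfAbsent(lastNumber, k -> new ArrayList<>()).add(row[1]);
            }
            grouped.forEach((number, names) ->
                System.out.println(number + "\t" + String.join("; ", names)));
        }
    }

Because the fill-down only ever looks at the previous row, the whole file is processed in one streaming pass, with no loop over the dataset.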

Pentaho Reporting

Hi,
I have a scenario where I need to combine two line graphs with different line styles, one solid and one dotted, in a single chart. Can you please illustrate a method to do so?

Define measures on basis of dimensions

How do I define measures on the basis of dimensions? E.g., I want to disable a specific measure when a certain dimension is selected. How can I achieve this in Schema Workbench?

Saiku OLAP Wizard error

Hello all,
I am working with Pentaho BI Server 5.4. I was trying to use the "Saiku OLAP Wizard" but got the following error:
"Failed
No class registered for id saiku-ui
Server Version: Pentaho Open Source BA Server 5.4.0.1-130"
A screenshot of the error is also attached (saiku error.jpg).

How can I resolve this? Do I need to install a plugin? If so, where can I get it?
Thank you,
Preeti

CDE template and CSS

Hello all,

I've run into a little problem. I created a nice dashboard and saved it as a template so I can apply it to other departments.
This all works fine (the data is correct and the functionality stays the same).
My new dashboards pick up the CSS, but it is totally different from the original dashboard's.
This means my layout is completely broken. I've tried everything with permissions and CSS in different folders, but nothing seems to work.
Does anybody have any idea?

Kind Regards,

KBinfo measure differs with same training data

Hello,

I've got a problem with my RBF network. I thought I could use the Kononenko & Bratko index (KBInfo) to get a better measure of accuracy for the final prediction. The problem is that KBInfo varies from run to run and I don't understand why.

For example, with a training file of 17000 instances I initially had a KBInfo of -0.89. Later on, when I had added about 40 more instances to my training data, the KBInfo for the exact same individual test came out a positive 0.3. OK, I thought, if I regenerate my training data with the original 17000 instances I should get the same -0.89 as the day before; but no, with the old training data I now get a KBInfo of 0.29...

I haven't had time to study in more detail what is happening. I'd like to know if someone has experienced this before and what could be causing this high variance in the KBInfo.

Thx & regards,
Jordi
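
If the swings come from random initialisation rather than from the data, fixing the seeds should make runs repeatable: RBFNetwork places its basis functions with an internally seeded k-means, and cross-validation shuffles the folds. A minimal sketch with the Weka Java API (the training file name is hypothetical, and that this explains the KBInfo variance is only an assumption):

Code:

    import java.util.Random;
    import weka.classifiers.Evaluation;
    import weka.classifiers.functions.RBFNetwork;
    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;

    public class KbInfoRepeatability {
        public static void main(String[] args) throws Exception {
            Instances data = new DataSource("training.arff").getDataSet(); // hypothetical file
            data.setClassIndex(data.numAttributes() - 1);

            RBFNetwork rbf = new RBFNetwork();
            rbf.setClusteringSeed(1); // fix the k-means seed used to place the basis functions

            Evaluation eval = new Evaluation(data);
            eval.crossValidateModel(rbf, data, 10, new Random(1)); // fix the fold shuffling too
            System.out.println("KB information (bits): " + eval.KBInformation());
        }
    }

If two runs with identical seeds and identical data still disagree, the cause lies elsewhere.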

Replace in String step in PDI throws java.lang.IllegalArgumentException

The following exception is encountered when using the "Replace in String" step in PDI Kettle. There is no variable or group name GKTIMESTAMP defined in the transformation, and regex-based replacement is turned off, yet it appears in the exception message. The environment is a 64-bit Linux machine with PDI 5.0.1 stable.

java.lang.IllegalArgumentException: No group with name {GKTIMESTAMP}
at java.util.regex.Matcher.appendReplacement(Matcher.java:849)
at java.util.regex.Matcher.replaceAll(Matcher.java:955)
at org.pentaho.di.trans.steps.replacestring.ReplaceString.replaceString(ReplaceString.java:79)
at org.pentaho.di.trans.steps.replacestring.ReplaceString.getOneRow(ReplaceString.java:124)
at org.pentaho.di.trans.steps.replacestring.ReplaceString.processRow(ReplaceString.java:202)
at org.pentaho.di.trans.step.RunThread.run(RunThread.java:60)
at java.lang.Thread.run(Thread.java:745)

Can anyone suggest what the reason for this error could be and how to resolve it?
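
One plausible explanation, judging from the stack trace: the step performs the substitution through java.util.regex.Matcher even when regex matching is disabled, and in a Matcher replacement string ${name} denotes a named-group reference. A literal ${GKTIMESTAMP} in the "Replace with" value (for example an unresolved Kettle variable) would therefore raise exactly this exception. A small demo of the underlying Java behaviour:

Code:

    import java.util.regex.Matcher;

    public class NamedGroupReplacement {
        public static void main(String[] args) {
            String replacement = "ts=${GKTIMESTAMP}"; // literal text, not meant as a group reference

            // Throws IllegalArgumentException: No group with name {GKTIMESTAMP}
            // "a b".replaceAll(" ", replacement);

            // Quoting makes Matcher treat the replacement literally:
            System.out.println("a b".replaceAll(" ", Matcher.quoteReplacement(replacement)));
        }
    }

If that is what is happening here, defining the variable, or escaping the ${...} sequence in the step's replace-with field, should avoid the error.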

Time series prediction with overlay data in Java

Consuming a JSON variable coming out of MongoDB as BSON

Hello forum,

I am tasked with parsing a document in MongoDB that has a number of nested arrays or elements of data. The number of nested elements can vary, so I cannot preset it manually in the MongoDB Input transformation component (e.g. as $.Sensors[0].Value, then $.Sensors[1].Value and so forth); hence I attempted to operate on the JSON variable coming out of the Mongo Input step.

However, it appears to be not actually JSON-compliant, but raw BSON instead (contrary to what the Mongo Input component specifies).
Therefore I have trouble parsing it with JSONPath in a JavaScript task (using a helper library), because it is not valid JSON.
Stripping the BSON metadata bits appears unreliable (because of stray bytes in the data).

I am curious whether anyone has a BSON Path or BSON parser that can extract the nested elements.
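
If what the Mongo Input step emits is Mongo extended JSON (ISODate(...), NumberLong(...) and so on) rather than raw binary BSON, the MongoDB Java driver can normalise it to strict JSON that ordinary JSONPath libraries accept. A sketch assuming the 3.x driver is on the classpath (the record text is made up):

Code:

    import org.bson.Document;

    public class NormalizeMongoJson {
        public static void main(String[] args) {
            // Hypothetical record text with shell-style extended-JSON wrappers
            // that break strict JSON parsers.
            String raw = "{ \"ts\" : ISODate(\"2016-01-19T00:00:00Z\"),"
                       + " \"Sensors\" : [ { \"Value\" : 1 }, { \"Value\" : 2 } ] }";
            Document doc = Document.parse(raw); // tolerant of extended JSON
            System.out.println(doc.toJson());   // strict JSON for a JSONPath library
        }
    }

Iterating a path like $.Sensors[*].Value over the normalised text then avoids having to preset the array size.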

Impersonate User functionality

Hi,

I am looking for some advice on how to build impersonation functionality.

I would like to be able (as an admin user) to choose from a list of users on the system, impersonate them, and see my CDE dashboards from their point of view.

We have dynamic schemas, meaning the cubes and dimensions within them vary per user, as do the CDE pages a user can potentially access.

I was thinking of having a Kettle endpoint that lists the users, and then another that receives a user to impersonate.

This would kick off another process to set the session variables, perhaps?

And then use those when embedding another dashboard within the CDE page.

But I am not having any luck implementing this last part.

Has anybody done anything like this before? Or have any suggestions on the best way to achieve it?

Thanks,

Blueprint Container Issue

I am trying to parse a larger than ordinary number of records. I do not have a dedicated Hadoop cluster, but I do have several servers. My aim is to create a dynamic cluster across several machines. However, I am receiving two errors. The master starts and runs programs, but the slave server experiences issues before and after registering successfully with the master.

How can I resolve these issues? I do not want to use the big-data-plugin HDFS capabilities, and I cannot find ILineageClient.

Prior to successful registration, I am receiving the following error:


Code:

ERROR [KarafLifecycleListener] Error in Blueprint Watcher
org.pentaho.osgi.api.IKarafBlueprintWatcher$BlueprintWatcherException: Unknown error in KarafBlueprintWatcher
    at org.pentaho.osgi.impl.KarafBlueprintWatcherImpl.waitForBlueprint(KarafBlueprintWatcherImpl.java:89)
    at org.pentaho.di.osgi.KarafLifecycleListener$2.run(KarafLifecycleListener.java:112)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.pentaho.osgi.api.IKarafBlueprintWatcher$BlueprintWatcherException: Timed out waiting for blueprints to load: pdi-dataservice-server-plugin,pentaho-big-data-impl-shim-initializer,pentaho-big-data-impl-shim-hdfs,pentaho-big-data-impl-shim-pig,pentaho-big-data-impl-vfs-hdfs,pentaho-big-data-kettle-plugins-common-named-cluster-bridge,pentaho-big-data-kettle-plugins-guiTestActionHandlers,pentaho-big-data-kettle-plugins-pig,pentaho-hadoop-shims-mapr-osgi-jaas,pentaho-big-data-impl-clusterTests,pentaho-big-data-impl-shim-shimTests,pentaho-metaverse-core,pentaho-requirejs-osgi-manager,pentaho-angular-bundle,pentaho-marketplace-di
    at org.pentaho.osgi.impl.KarafBlueprintWatcherImpl.waitForBlueprint(KarafBlueprintWatcherImpl.java:77)
    ... 2 more

After loading the properties, I successfully connect to the master for a time.

Code:

2016/01/19 12:23:00 - Carte - Registered this slave server to master slave server [Master] on address [xxxxxxxxxxxxxxxx:8999]
2016/01/19 12:23:00 - Carte - Registered this slave server to master slave server [Master] on address [xxxxxxxxxxxxxxx:8999]
2016/01/19 12:23:00 - Carte - Created listener for webserver @ address : localhost:8199

However, after about a minute or so (the object timeout is set to 1 minute), I get the following log output. No tasks execute on the master before or after registration (actor model?).

Code:

[BlueprintContainerImpl] Unable to start blueprint container for bundle pdi-dataservice-server-plugin due to unresolved dependencies [(objectClass=org.pentaho.metaverse.api.ILineageClient)]
java.util.concurrent.TimeoutException
    at org.apache.aries.blueprint.container.BlueprintContainerImpl$1.run(BlueprintContainerImpl.java:336)


Passing div class values

I've got a data source query that returns three values. I can successfully get those values passed to my HTML and displayed properly using <div id=""></div>. However, I want to style the panel those numbers are in differently based on those values: basically, if a number is positive, make it green; if negative, make it red; etc.

Code:

    if (percentChange < 0) {
        // Negative percent change: warning styling
        panelClass = 'panel-footer custom-panel-warning';
    } else {
        // Positive (or zero) percent change: success styling
        panelClass = 'panel-footer custom-panel-success';
    }
    // One way to apply it (the element id is just an example):
    $('#percentChangePanel').attr('class', panelClass);

How can I get the div class to accept the panelClass variable value?

Transformation works in Spoon but not in my Java Application and there are no errors!

I am new to Pentaho Kettle and I have created several simple transformations and jobs in Spoon.

I have a job that runs a transformation which simply pulls data from CSV files, adds a couple of fields to each row, and sends the rows to MongoDB. I also have an error step (Write to Log) coming off of the MongoDB Output step.

The job and transformation run perfectly in Spoon and all rows appear in MongoDB.

However, when I run the job from my Java app, everything runs perfectly until it gets to the MongoDB Output step. There, all the rows go to the Write to Log step and no errors are recorded. :(

When I take out the Write to Log step, there are still no errors recorded and no rows are written to MongoDB.

I'm wondering if there is any DB configuration I need to do in my Java app, but I thought Kettle would take care of that for the transformation.
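
A hedged guess: MongoDB Output is a plugin step, and a standalone Java application only knows the plugins that the Kettle environment registers at init time; Spoon finds them in its own plugins folder, which your app may not see, and a step that cannot fully initialise its plugin can fail rows without a useful message. A sketch of running the job with plugin discovery pointed at a PDI installation and row-level logging (paths are examples):

Code:

    import org.pentaho.di.core.KettleEnvironment;
    import org.pentaho.di.core.logging.LogLevel;
    import org.pentaho.di.job.Job;
    import org.pentaho.di.job.JobMeta;

    public class RunJobWithPlugins {
        public static void main(String[] args) throws Exception {
            // Point plugin discovery at a full PDI install so MongoDB Output
            // is registered exactly as it is in Spoon (example path).
            System.setProperty("KETTLE_PLUGIN_BASE_FOLDERS",
                "/opt/pentaho/data-integration/plugins");
            KettleEnvironment.init();

            JobMeta jobMeta = new JobMeta("/path/to/my_job.kjb", null); // example path
            Job job = new Job(null, jobMeta);
            job.setLogLevel(LogLevel.ROWLEVEL); // surface errors hidden at BASIC level
            job.start();
            job.waitUntilFinished();
            System.out.println("Errors: " + job.getResult().getNrErrors());
        }
    }

Running at ROWLEVEL usually reveals what the MongoDB Output step is actually complaining about when the rows divert to the error hop.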

Remove Saiku image from PDF/PNG chart export

Hi! I need to know whether it is possible to remove the Saiku image from the PDF/PNG file when exporting a chart from Saiku Analytics. I'm using the Saiku Community Edition.
I attached a file containing a chart generated with Saiku (the image is in the corner).

Thank you a lot!
Regards

Move XML files after successful DB write

I am looking for a clean solution for moving a processed XML file to another folder. InputFilename and outputFilename are provided at the beginning of the job from the results of a previous transformation. The current transformation is called row by row, so at each iteration one file is processed, written to the DB, and finally moved to another location.

The problem I am facing is the following:
- If I put a "process file" component (move operation) after the Table Output step, it gets triggered as many times as there are rows coming out of the Table Output component, so this solution does not work well in this case.
- If instead the transformation waits for all Table Output rows to finish, a problem arises with referencing the outputFilename field from the beginning of the transformation. Is there any way to reference that column without assigning a variable at the beginning of the job?

Details in the picture:
000046.jpg

Thank you

Strange error encountered during Job execution

I am getting the following error in my environment; the exception is encountered when using the "Replace in String" step in PDI Kettle. There is no variable or group name GKTIMESTAMP defined in the transformation, and regex-based replacement is turned off, yet the field/group name GKTIMESTAMP appears in the exception message. The environment is a 64-bit Linux machine with PDI 5.0.1 stable.


Another thing I noticed this time during the failure: after the transformation name and step name in square brackets, the log message prints "null" (as shown in the failure log snippet below) instead of the step name (as in the successful log snippet), followed by a processing status such as "- Linenr 50000".


Failure log snippet:


2016-01-19 06:11:31,601 INFO [TaskHandlerJob IncrementalTask - Sorted Merge] null - Linenr 50000
2016-01-19 06:11:31,714 INFO [TaskHandlerJob IncrementalTask - Group byField] null - Linenr 50000
2016-01-19 06:11:32,341 INFO [TaskHandlerJob IncrementalTask - Sorted Merge] null - Linenr 100000
2016-01-19 06:11:32,455 INFO [TaskHandlerJob IncrementalTask - Group byField] null - Linenr 100000
2016-01-19 06:11:33,013 INFO [TaskHandlerJob IncrementalTask - Sorted Merge] null - Linenr 150000
2016-01-19 06:11:33,133 INFO [TaskHandlerJob IncrementalTask - Group byField] null - Linenr 150000
2016-01-19 06:11:33,875 ERROR [TaskHandlerJob IncrementalTask - replace_string] null - Unexpected error
2016-01-19 06:11:33,875 ERROR [TaskHandlerJob IncrementalTask - replace_string] null - java.lang.IllegalArgumentException: No group with name {GKTIMESTAMP}
at java.util.regex.Matcher.appendReplacement(Matcher.java:849)
at java.util.regex.Matcher.replaceAll(Matcher.java:955)
at org.pentaho.di.trans.steps.replacestring.ReplaceString.replaceString(ReplaceString.java:79)
at org.pentaho.di.trans.steps.replacestring.ReplaceString.getOneRow(ReplaceString.java:124)
at org.pentaho.di.trans.steps.replacestring.ReplaceString.processRow(ReplaceString.java:202)
at org.pentaho.di.trans.step.RunThread.run(RunThread.java:60)
at java.lang.Thread.run(Thread.java:745)




In the ideal case, i.e. when the job runs successfully, the step name appears where the null is shown above. The corresponding log snippet is given below. Is there any reason why the above scenario occurred?


Successful log snippet:


2016-01-19 06:11:31,601 INFO [TaskHandlerJob IncrementalTask - Sorted Merge] Sorted Merge - Linenr 50000
2016-01-19 06:11:31,714 INFO [TaskHandlerJob IncrementalTask - Group byField] Group byField - Linenr 50000
2016-01-19 06:11:32,341 INFO [TaskHandlerJob IncrementalTask - Sorted Merge] Sorted Merge - Linenr 100000
2016-01-19 06:11:32,455 INFO [TaskHandlerJob IncrementalTask - Group byField] Group byField - Linenr 100000
2016-01-19 06:11:33,013 INFO [TaskHandlerJob IncrementalTask - Sorted Merge] Sorted Merge - Linenr 150000
2016-01-19 06:11:33,133 INFO [TaskHandlerJob IncrementalTask - Group byField] Group byField - Linenr 150000