Command line execution uploading model + filters

June 27, 2013, 1:30 am

Hi,
I'm using this command line to train a model:

Code:

 java -Djava.util.Arrays.useLegacyMergeSort=true weka.classifiers.meta.FilteredClassifier \
   -t $MYPATH/inputs/$1.arff \
   -x 3 -s ${i} -p 1 -distribution \
   -d $MYPATH/results/$1.model2 \
   -F "weka.filters.MultiFilter \
        -F \"weka.filters.unsupervised.attribute.Remove -R 9\" \
        -F \"weka.filters.unsupervised.attribute.RemoveType -T string\" \
        -F \"weka.filters.supervised.instance.SMOTE -C 0 -K 5 -P $2 -S 1\"" \
   -W weka.classifiers.functions.MultilayerPerceptron -- -L 0.3 -M 0.2 -N 500 -V 0 -S 0 -E 20 -H 0 \
   > $MYPATH/results/100test/$1.${i}-p2.rbf

Now I want to use the resulting model to test new datasets. These datasets also need attribute 9 removed, and they also contain string attributes. I have tried different combinations but can't find the right one. For example:

Code:

 java -Djava.util.Arrays.useLegacyMergeSort=true weka.classifiers.meta.FilteredClassifier \
       -l $MYPATH/$1/results/$1.model2 \
       -T $MYPATH/${swp}/inputs/${swp}.arff \
       -x 3 -s ${i} -p 1 -distribution \
       -F "weka.filters.unsupervised.attribute.Remove -R 9" \
       -W weka.classifiers.functions.MultilayerPerceptron -- -L 0.3 -M 0.2 -N 500 -V 0 -S 0 -E 20 -H 0 \
       > $MYPATH/$1/results/100test/$1.${i}.$swp.rbf

But it complains about illegal options.
Any suggestions on how this command should be structured?
Thanks in advance.

↧

WebBased report creation, Report spec is not valid with field data type "URL"

June 27, 2013, 1:30 am

Hi,

I have a problem creating a report from the web based report creation tool with a field of type "URL" from the metadata.

The report (a simple listing) works if I change the data type to String in the metadata editor. If I change it to URL, I get:

AdhocWebService.ERROR_0002 - Report spec is not valid

The log shows:

2013-06-27 10:29:45,310 ERROR [org.pentaho.jfreereport.wizard.utility.CastorUtility] For input string: "undefined"
For input string: "undefined"{file: [not available]; line: 58; column: 129}
at org.exolab.castor.xml.Unmarshaller.unmarshal(Unmarshaller.java:732)

Any help will be appreciated, thanks!

↧

SETting Email in DRAFTs of Mailbox

June 27, 2013, 4:34 am

Hi all,

Because I am having problems sending mail through the SMTP server, I wondered whether it is possible to just create an email in the Drafts folder of Outlook, with the receiver's email address and an attachment, using Pentaho DI 4.2.0. The email would then be sent from Outlook, not from Kettle. Is there any way to do this in Kettle?

Thank you very much in advance.

↧

where is location source code of [org.pentaho.platform.web.http.security]?

June 27, 2013, 4:40 am

I want to modify the source code of org.pentaho.platform.web.http.security.RequestParameterAuthenticationFilter.

Does anyone know where the JAR or the Java source code for it is located in Pentaho?

Thanks so much.

↧

Inserting new Column with present Date

June 27, 2013, 4:42 am

Hi all,

I have some data in text files. I need to add a new column containing the current date-time to the data in these text files, without changing the file names.

Thanks in advance for your help.
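
Outside of PDI, the requirement itself can be sketched in a few lines of plain Java. This is only an illustration under stated assumptions: the class and method names (`AddDateColumn`, `withColumn`, `rewrite`) are hypothetical, the files are assumed to be small enough to read into memory, to use `;` as the separator, and to have no header line (a header would get the timestamp appended too).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class AddDateColumn {
    // Appends the same timestamp as a new last column to every line.
    static List<String> withColumn(List<String> lines, String separator, String stamp) {
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            out.add(line + separator + stamp);
        }
        return out;
    }

    // Reads the file, adds the column, and writes back to the same path,
    // so the file name does not change.
    static void rewrite(Path file, String separator) throws IOException {
        String stamp = LocalDateTime.now()
                .format(DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss"));
        List<String> lines = Files.readAllLines(file, StandardCharsets.UTF_8);
        Files.write(file, withColumn(lines, separator, stamp), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Demo on a temporary file with made-up content.
        Path tmp = Files.createTempFile("sample", ".txt");
        Files.write(tmp, Arrays.asList("a;1", "b;2"), StandardCharsets.UTF_8);
        rewrite(tmp, ";");
        for (String line : Files.readAllLines(tmp, StandardCharsets.UTF_8)) {
            System.out.println(line);
        }
        Files.delete(tmp);
    }
}
```

Every row gets the same timestamp, taken once per file, which is usually what "current date-time of the load" means; a per-row timestamp would need the `LocalDateTime.now()` call moved inside the loop.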

↧

"Dependency level" in function not working?

June 27, 2013, 4:54 am

Hi, I've got an issue with a report.

I've created a report that seems to have a problem handling the "dependency level" property of a function.
What I want is a kind of conditional sum that adds up the values of only the first row of each defined group (in the attached example report that group is called "Order").
Since I couldn't find a "Conditional Sum" function, I handled it with three functions: one to calculate the row number within the group, another to set a value on the first row only (and 0 otherwise), and a third to sum those values.
That sum returns "77" when it should be "122". I realized that it is adding 28 and 49, when it should be adding 27, 20, 41 and 34. If I change the condition in the function "first_row_value" to "=IF([row_number_in_order] = 0;[QUANTITYORDERED];0)", then the sum returns 122. So I guess PRD could be having trouble with the dependency level.

It's not that simple to explain in words, so I hope the attached example helps to clear things up.

If someone could help me it would be great!
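
To make the intended result concrete, here is a minimal sketch of the "sum only the first row of each group" logic in plain Java, independent of the report engine. The class `FirstRowSum`, the order numbers, and the grouping of the rows are hypothetical; only the quantities 27, 20, 41, 34 (expected first rows) and 28, 49 (the values wrongly summed) come from the post.

```java
import java.util.Arrays;
import java.util.List;

class FirstRowSum {
    // One detail row: the group key (order number) and the quantity on that row.
    static final class Row {
        final int order;
        final int quantity;
        Row(int order, int quantity) {
            this.order = order;
            this.quantity = quantity;
        }
    }

    // Hypothetical data: how the rows are split across orders is an assumption.
    static final List<Row> SAMPLE = Arrays.asList(
            new Row(10100, 27), new Row(10100, 28),
            new Row(10101, 20),
            new Row(10102, 41), new Row(10102, 49),
            new Row(10103, 34));

    // Adds the quantity of the first row of each group.
    // Rows must already be sorted by group, as they are in a grouped report.
    static int sumFirstRows(List<Row> rows) {
        int sum = 0;
        Integer previousOrder = null;
        for (Row row : rows) {
            boolean firstInGroup = previousOrder == null || previousOrder != row.order;
            if (firstInGroup) {
                sum += row.quantity;
            }
            previousOrder = row.order;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumFirstRows(SAMPLE)); // 27 + 20 + 41 + 34
    }
}
```

The point of the sketch is the ordering: the "is this the first row?" decision has to be made before the value is added, which is the evaluation order the dependency level is expected to guarantee between the three functions.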

Attached Files

test_conditional_sum.prpt (4.8 KB)

↧

Sql server 2012 and BI server integration errors

June 27, 2013, 6:05 am

Dear All,
I'm new to the Pentaho BI Server Community Edition.
I'm using biserver-ce-4.8.0 on Windows 7 with SQL Server 2012.
I was able to install the BI server as a service successfully.
I can start the Admin Console (with error messages), but I can't create any database connection.
I have also noticed that errors are reported in the stdout.log file (see attachment).
I'm counting on experienced Pentaho users to help me solve these problems.
Thanks in advance.
Adam

Attached Files

stdout.log (15.6 KB)

↧

How can we use WEKA via MATLAB? Is there any useful link, tutorial or video?

June 27, 2013, 6:15 am

Hi All,

Is there any documentation, link, web page or other source on using WEKA from MATLAB?

Thanks..

↧

Report Designer html-anchor based on group field value

June 27, 2013, 6:28 am

Hello,

I've got a group in a report, and when someone clicks on a slice of a pie chart, I want to jump to an anchor in the report below.

I have the pie chart generating the correct HTML links: http://myreport/index.html#SectionA, http://myreport/index.html#SectionB, and so on.

This is correct.

Problem is, on my group header I'm printing the field value, but cannot get the html-anchor to display each section.

I've tried =[Section], =["Section"], ${Section}, =${Section}, and everything I can think of in the html-anchor style. It generates the <a name='...'> tag, but the ... is always literally what I put into the html-anchor field, it does not use the current row value for the field...

Thanks

↧

Is it possible to do classification, based on the features that I specify manually?

June 27, 2013, 7:28 am

Hi All,

I want to make a classification with the features I specify manually.

Is it possible in WEKA?

Thanks...

↧

InstanceQuery Java heap space

June 27, 2013, 9:01 am

Hi,

My data is stored in PostgreSQL. When I try to load it with the code below, I get a "Java heap space" error.
My JVM is configured with -Xmx1024M.

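import weka.core.Instances;
import weka.experiment.InstanceQuery;

// Build the JDBC URL for the PostgreSQL database (credentials passed as URL parameters).
String url = "jdbc:postgresql://localhost:5432/" + dataBase + "?user=" + usuario + "&password=" + pass;
InstanceQuery query = new InstanceQuery();
query.setDatabaseURL(url);
query.setQuery(sql);
// retrieveInstances() loads the whole result set into memory at once.
Instances training = query.retrieveInstances();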

My dataset is big.
Any suggestions?

Thanks...

↧

What did you learn about Kettle that made your job suddenly easier?

June 27, 2013, 9:09 am

Hi!

Here is my little gold nugget:

I always wondered why I had to specify a DB connection for every transformation in a job. I just realized that you can "share" a DB connection that is specified in a job, which makes it accessible to every transformation used in that job!

Curious about your wisdom :)

Cheers

Raffael

Attached Images

Clipboard02.jpg (17.2 KB)

↧

Get Data XML parsing with nested nodes

June 27, 2013, 10:00 am

Hello Guys!
I'm trying to solve a parsing problem in PDI. This is the attachment: getMesh.zip
I have this input:

Quote:

<?xml version="1.0"?>
<DescriptorRecordSet LanguageCode = "eng">
<DescriptorRecord DescriptorClass = "1">
<DescriptorUI>D000008</DescriptorUI>
<DescriptorName>
<String>Abdominal Neoplasms</String>
</DescriptorName>
<DateCreated>
<Year>1999</Year>
<Month>01</Month>
<Day>01</Day>
</DateCreated>
<DateRevised>
<Year>1995</Year>
<Month>06</Month>
<Day>08</Day>
</DateRevised>
<AllowableQualifiersList>
<AllowableQualifier>
<QualifierReferredTo>
<QualifierUI>Q000737</QualifierUI>
<QualifierName>
<String>chemistry</String>
</QualifierName>
</QualifierReferredTo>
<Abbreviation>CH</Abbreviation>
</AllowableQualifier>
<AllowableQualifier>
<QualifierReferredTo>
<QualifierUI>Q000821</QualifierUI>
<QualifierName>
<String>virology</String>
</QualifierName>
</QualifierReferredTo>
<Abbreviation>VI</Abbreviation>
</AllowableQualifier>
</AllowableQualifiersList>
<Annotation>general term for neopl of organs in the abdom cavity; prefer specific organ/neopl terms; /blood supply /chem /second /secret /ultrastruct permitted; coord IM with histol type of neopl if given (IM)
</Annotation>
<SeeRelatedList>
<SeeRelatedDescriptor>
<DescriptorReferredTo>
<DescriptorUI>D034861</DescriptorUI>
<DescriptorName>
<String>Abdominal Wall</String>
</DescriptorName>
</DescriptorReferredTo>
</SeeRelatedDescriptor>
</SeeRelatedList>
<TreeNumberList>
<TreeNumber>1</TreeNumber>
<TreeNumber>2</TreeNumber>
</TreeNumberList>
<ConceptList>
<Concept PreferredConceptYN="Y">
<ConceptUI>M0000008</ConceptUI>
<ConceptName>
<String>Abdominal Neoplasms</String>
</ConceptName>
<ConceptUMLSUI>C0000735</ConceptUMLSUI>
<SemanticTypeList>
<SemanticType>
<SemanticTypeUI>T191</SemanticTypeUI>
<SemanticTypeName>Neoplastic Process</SemanticTypeName>
</SemanticType>
</SemanticTypeList>
<TermList>
<Term ConceptPreferredTermYN="Y" IsPermutedTermYN="N" LexicalTag="NON" PrintFlagYN="Y" RecordPreferredTermYN="Y">
<TermUI>T000016</TermUI>
<String>Abdominal Neoplasms</String>
<DateCreated>
<Year>1999</Year>
<Month>01</Month>
<Day>01</Day>
</DateCreated>
<EntryVersion>ABDOMINAL NEOPL</EntryVersion>
<ThesaurusIDlist>
<ThesaurusID>NLM (1966)</ThesaurusID>
</ThesaurusIDlist>
</Term>
<Term ConceptPreferredTermYN="N" IsPermutedTermYN="Y" LexicalTag="NON" PrintFlagYN="N" RecordPreferredTermYN="N">
<TermUI>T000016</TermUI>
<String>Abdominal Neoplasm</String>
</Term>
<Term ConceptPreferredTermYN="N" IsPermutedTermYN="Y" LexicalTag="NON" PrintFlagYN="N" RecordPreferredTermYN="N">
<TermUI>T000016</TermUI>
<String>Neoplasm, Abdominal</String>
</Term>
<Term ConceptPreferredTermYN="N" IsPermutedTermYN="Y" LexicalTag="NON" PrintFlagYN="N" RecordPreferredTermYN="N">
<TermUI>T000016</TermUI>
<String>Neoplasms, Abdominal</String>
</Term>
</TermList>
</Concept>
</ConceptList>
</DescriptorRecord>
</DescriptorRecordSet>

I need to map it to a table, and I am looking for a way to produce the result.
Here is the mapping between each column and its XPath query:

mh_ui /DescriptorRecordSet/DescriptorRecord/DescriptorUI
mh_name /DescriptorRecordSet/DescriptorRecord/DescriptorName/String
mh_year /DescriptorRecordSet/DescriptorRecord/DateCreated/Year
mh_subheadings /DescriptorRecordSet/DescriptorRecord/AllowableQualifiersList/AllowableQualifier/QualifierReferredTo/QualifierName/String Here I have to match all entries, not only the first: I should copy a new row into the stream for every Qualifier I find, and the other fields must contain the values for that record.
mh_reference /DescriptorRecordSet/DescriptorRecord/SeeRelatedList/SeeRelatedDescriptor/DescriptorReferredTo/DescriptorName/String Same problem as before.
mh_description /DescriptorRecordSet/DescriptorRecord/ConceptList/Concept/ScopeNote Same problem as before.
mh_sinonimous /DescriptorRecordSet/DescriptorRecord/ConceptList/Concept/TermList/Term/String Same problem as before, but in this case I should also concatenate the strings, separating them with a pipe.

So with the step "Get data from XML" I am able to parse the simple fields but not the lists. How could I solve this? I attached my sample transformation with my chunk.xml.
The original file is about 300 MB in size.

Thank you so much
Any help is appreciated !
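
The row expansion described above (one output row per qualifier, with the synonyms joined by a pipe) can be sketched in plain Java with the JDK's XPath API. This is not PDI code and not a recommended way to process the full file: it loads the whole document into memory, which would not suit a 300 MB input. The class `MeshFlatten`, the `;` output separator and the trimmed sample document are assumptions made for illustration; the idea is the same as looping over the repeating node and reading the parent fields relative to it.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class MeshFlatten {
    // Trimmed-down version of the record quoted in the post.
    static final String SAMPLE =
          "<DescriptorRecordSet><DescriptorRecord>"
        + "<DescriptorUI>D000008</DescriptorUI>"
        + "<DescriptorName><String>Abdominal Neoplasms</String></DescriptorName>"
        + "<AllowableQualifiersList>"
        + "<AllowableQualifier><QualifierReferredTo><QualifierName>"
        + "<String>chemistry</String></QualifierName></QualifierReferredTo></AllowableQualifier>"
        + "<AllowableQualifier><QualifierReferredTo><QualifierName>"
        + "<String>virology</String></QualifierName></QualifierReferredTo></AllowableQualifier>"
        + "</AllowableQualifiersList>"
        + "<ConceptList><Concept><TermList>"
        + "<Term><String>Abdominal Neoplasms</String></Term>"
        + "<Term><String>Abdominal Neoplasm</String></Term>"
        + "</TermList></Concept></ConceptList>"
        + "</DescriptorRecord></DescriptorRecordSet>";

    // Returns one "ui;name;subheading;synonyms" row per AllowableQualifier.
    // A record without qualifiers produces no row in this sketch.
    static List<String> flatten(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xp = XPathFactory.newInstance().newXPath();
            List<String> rows = new ArrayList<>();
            NodeList records = (NodeList) xp.evaluate(
                    "/DescriptorRecordSet/DescriptorRecord", doc, XPathConstants.NODESET);
            for (int i = 0; i < records.getLength(); i++) {
                Node rec = records.item(i);
                // Single-valued fields, read relative to the current record.
                String ui = xp.evaluate("DescriptorUI", rec);
                String name = xp.evaluate("DescriptorName/String", rec);
                // All term strings of the record, joined with a pipe.
                NodeList terms = (NodeList) xp.evaluate(
                        "ConceptList/Concept/TermList/Term/String", rec, XPathConstants.NODESET);
                StringBuilder synonyms = new StringBuilder();
                for (int t = 0; t < terms.getLength(); t++) {
                    if (t > 0) synonyms.append('|');
                    synonyms.append(terms.item(t).getTextContent());
                }
                // One output row per qualifier, repeating the record-level fields.
                NodeList quals = (NodeList) xp.evaluate(
                        "AllowableQualifiersList/AllowableQualifier/QualifierReferredTo/QualifierName/String",
                        rec, XPathConstants.NODESET);
                for (int q = 0; q < quals.getLength(); q++) {
                    rows.add(ui + ";" + name + ";" + quals.item(q).getTextContent() + ";" + synonyms);
                }
            }
            return rows;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the XML", e);
        }
    }

    public static void main(String[] args) {
        for (String row : flatten(SAMPLE)) {
            System.out.println(row);
        }
    }
}
```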

Attached Files

getMesh.zip (4.8 KB)

↧

milliseconds supported in PDI

June 27, 2013, 10:41 am

Hi,

I am trying to load dates whose values include milliseconds, for example 2013-06-27 17:50:40:414.
Is this supported in PDI CE 4.4?

Thanks
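
For reference, the example value can be parsed in plain Java with the pattern `yyyy-MM-dd HH:mm:ss:SSS`. This is a minimal sketch, assuming that the date format mask used in PDI follows `java.text.SimpleDateFormat` conventions; the class `MillisParse` and its methods are hypothetical names used only for illustration.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;

class MillisParse {
    // Note the colon (not a dot) before the milliseconds, matching the example value.
    static final String MASK = "yyyy-MM-dd HH:mm:ss:SSS";

    // Parses the text strictly with MASK, in the JVM's default time zone.
    static Date parse(String text) {
        SimpleDateFormat format = new SimpleDateFormat(MASK);
        format.setLenient(false);
        try {
            return format.parse(text);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a valid timestamp: " + text, e);
        }
    }

    // Extracts the millisecond part of a parsed date.
    static int millisOf(Date date) {
        Calendar calendar = Calendar.getInstance();
        calendar.setTime(date);
        return calendar.get(Calendar.MILLISECOND);
    }

    public static void main(String[] args) {
        Date date = parse("2013-06-27 17:50:40:414");
        System.out.println(millisOf(date));
    }
}
```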

↧

Pentaho Report Designer 3.8.3 or 3.9.1 Freezes on Snow Leopard after OSX update

June 27, 2013, 12:07 pm

Is anyone else having problems running the Pentaho Report Designer on OSX Snow Leopard after the last Java update?
Diagnostics can be found in this thread if you're interested: http://forums.pentaho.com/showthread...0-6-8&p=345475

Any help appreciated!!!!

↧

Bringing data up from a Sub-report into the Main Report

June 27, 2013, 12:35 pm

Is there a way to bring up data from a sub-report into the main report? From what I've seen on the web, it looks like it should be possible, I just haven't figured it out yet. Any hints or tips would be great. I'm using Pentaho Report Designer 3.9.1-GA.

Thanks,
BG

↧

UDJC gives error "Unable to find info row set for step ..."

June 27, 2013, 12:39 pm

Hello,

I played around a bit with the UDJC step today and tried to modify the "Working with info steps" tutorial from this blog.
For some reason I cannot explain, it gives me the error "Unable to find info row set for step 'Rates step'".

I have attached the transformation, so you can have a look into it.
Thanks for any advice!

Bobse

Attached Files

UDJC Info step.ktr (12.7 KB)

↧

Bulk Load into Infobright

June 27, 2013, 1:44 pm

Hi,

I am trying to do a bulk load into Infobright using PDI, but I don't see a bulk loader step for it. Do I have to use the MySQL Bulk Loader?

Thanks

↧

Import Data from File to DW

June 27, 2013, 2:13 pm

Hello.

I need to import data from an XLS file into my data warehouse every hour.
I'm thinking of using CDE to create a dashboard where the user enters the file path, which is then used in a step in a KTR.
Does anyone know how this can be done, or is there a better way?

Thanks !!!
Emannuel Roque

↧

Performance problem for PluginRegistry.init() on eclipse/tomcat

June 27, 2013, 2:50 pm

Hi,

I initialize Kettle at the start of my J2EE application.
On Tomcat on my Linux server it's pretty quick, but in my development environment, based on Eclipse and Tomcat, it takes around 10 minutes.

In the debugger, it is stuck on PluginRegistry.init() in the initialization sequence.

Has anybody seen this problem before?

Thx

Jer

↧