Channel: Pentaho Community Forums

Detect column names in files (CSV, TXT, etc.) automatically

I'm wondering whether PDI can obtain the column (field) names in a file automatically.

Normally, we have to click "Get fields" in an input step such as "CSV Input" to obtain them, so that the following steps (e.g. "Sort rows") can work with those fields.

However, we need to build an automated PDI process for other people to use, so the input files may differ from run to run.

I've tried using scripts, but they didn't seem to work (I'm not familiar with Java...).

I also tried the CPython plugin, but it failed with the following messages:

------script here-----------
#python script
import pandas as pd
import csv

raw_data = pd.read_csv(${para_1},sep=',')
-----------------------------

2016/07/21 11:11:07 - CPython Script Executor.0 - ERROR (version 5.4.0.1-130, build 1 from 2015-06-14_12-34-55 by buildguy) : Unexpected error
2016/07/21 11:11:07 - CPython Script Executor.0 - ERROR (version 5.4.0.1-130, build 1 from 2015-06-14_12-34-55 by buildguy) : java.lang.NullPointerException
2016/07/21 11:11:07 - CPython Script Executor.0 - at org.pentaho.python.PythonSession.executeScript(PythonSession.java:479)
2016/07/21 11:11:07 - CPython Script Executor.0 - at org.pentaho.di.trans.steps.cpythonscriptexecutor.CPythonScriptExecutor.executeScript(CPythonScriptExecutor.java:446)
2016/07/21 11:11:07 - CPython Script Executor.0 - at org.pentaho.di.trans.steps.cpythonscriptexecutor.CPythonScriptExecutor.executeScriptAndProcessResult(CPythonScriptExecutor.java:349)
2016/07/21 11:11:07 - CPython Script Executor.0 - at org.pentaho.di.trans.steps.cpythonscriptexecutor.CPythonScriptExecutor.processBatch(CPythonScriptExecutor.java:338)
2016/07/21 11:11:07 - CPython Script Executor.0 - at org.pentaho.di.trans.steps.cpythonscriptexecutor.CPythonScriptExecutor.processRow(CPythonScriptExecutor.java:243)
2016/07/21 11:11:07 - CPython Script Executor.0 - at org.pentaho.di.trans.step.RunThread.run(RunThread.java:62)
2016/07/21 11:11:07 - CPython Script Executor.0 - at java.lang.Thread.run(Thread.java:745)
2016/07/21 11:11:07 - CPython Script Executor.0 - Finished processing (I=0, O=0, R=0, W=0, U=0, E=1)
2016/07/21 11:11:07 - temp - Transformation detected one or more steps with errors.
2016/07/21 11:11:07 - temp - Transformation is killing the other steps!
2016/07/21 11:11:07 - temp - ERROR (version 5.4.0.1-130, build 1 from 2015-06-14_12-34-55 by buildguy) : Errors detected!
2016/07/21 11:11:07 - Spoon - The transformation has finished!!
2016/07/21 11:11:07 - temp - ERROR (version 5.4.0.1-130, build 1 from 2015-06-14_12-34-55 by buildguy) : Errors detected!
2016/07/21 11:11:07 - temp - ERROR (version 5.4.0.1-130, build 1 from 2015-06-14_12-34-55 by buildguy) : Errors detected!



If there is any way to detect column names without clicking the "Get fields" button, please tell me. Thanks a lot for the help!
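
For reference, outside of PDI the header row itself is easy to read. Below is a minimal Python sketch, not a confirmed fix for the NullPointerException above: it assumes the first line of the file holds the column names and that the file path reaches the script as an ordinary Python string (the post does not show what ${para_1} expands to). The function names are hypothetical.

```python
import csv
import io


def detect_column_names(source, delimiter=","):
    """Return the fields of the first row of a delimited text stream.

    Assumption: the first row is the header. `source` is any iterable
    of text lines, e.g. an open file or io.StringIO.
    """
    reader = csv.reader(source, delimiter=delimiter)
    # next() with a default avoids StopIteration on an empty file.
    return next(reader, [])


def detect_column_names_from_path(path, delimiter=",", encoding="utf-8"):
    """Open `path` and return its header fields (hypothetical helper)."""
    # newline="" is what the csv module expects for files it reads.
    with open(path, newline="", encoding=encoding) as handle:
        return detect_column_names(handle, delimiter)


if __name__ == "__main__":
    sample = io.StringIO("id,name,score\n1,alice,90\n")
    print(detect_column_names(sample))
```

With pandas, which the script above already imports, `pd.read_csv(path, sep=',', nrows=0).columns.tolist()` should give the same list without loading any data rows.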
