Hi...
I've been reading up on and playing with Metadata Injection in PDI 7.0 Community Edition. The examples I've seen (e.g.
https://help.pentaho.com/Documentati...data_Injection and the sample at %PDI%\data-integration\samples\transformations\meta-inject ) both hardcode the possible fields somewhere (e.g. the "Fields" step in
use_metainject_step.ktr from the second example above).
I'm trying to do something a little simpler. Can I use Metadata Injection in PDI (7.0 minimum?) to dynamically load into MongoDB
any flat file that meets certain criteria--for example, pipe-delimited with a header row? I'd like to use the
same PDI import to load a flat file with 10 columns or another flat file with 500 columns, assuming both are pipe-delimited with a header row. My criteria are simple:
- Parse out the variable number of field names from row 1 & set the datatypes as text/string.
- Load rows 2+ as text/string into the columns created above. (I don't care about numbers, dates, etc.)
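To make the two criteria above concrete, here's a rough sketch in plain Python (not PDI, and not a Metadata Injection example) of the behaviour I'm after. The function name rows_as_documents and the sample data are made up for illustration, and I'm assuming the files don't use quoting, so every value is taken literally as a string:

```python
import csv
import io


def rows_as_documents(lines, delimiter="|"):
    """Yield one dict per data row, keyed by the field names found in row 1.

    Every value stays a string; no numeric or date parsing is attempted.
    """
    # QUOTE_NONE: treat quote characters as ordinary data (an assumption
    # about the input files, not something the format guarantees).
    reader = csv.reader(lines, delimiter=delimiter, quoting=csv.QUOTE_NONE)
    header = next(reader, None)  # row 1 holds the variable list of field names
    if header is None:
        return  # empty input: nothing to yield
    for row in reader:
        if row:  # skip completely blank lines
            yield dict(zip(header, row))


# Hypothetical 3-column sample; a 500-column file would go through the same code.
sample = io.StringIO("id|name|city\n1|Ann|Oslo\n2|Bob|Rome\n")
docs = list(rows_as_documents(sample))
```

Each resulting dict would correspond to one MongoDB document. What I'm hoping is that Metadata Injection can do the equivalent of the header-reading step inside PDI, so the field list never has to be hardcoded.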
Here's the problem I'm trying to solve. For years, people have been asking the makers of MongoDB to add a command-line parameter to mongoimport that would allow the use of any delimiter, as opposed to only supporting JSON, CSV, and tab-delimited input. That request has fallen on deaf ears. I'm hoping that PDI 7.0+ with Metadata Injection could solve this problem, as converting the input flat files into a different format (e.g. tab-delimited) is not an option for us (we don't change incoming data, and we're talking about 100k+ files).
I'm not asking anyone to do this work for me--I'm just looking to see whether this is possible and whether anyone has done it. It sounds simple enough, but I haven't found such an example online, especially since PDI 7.0 and support for Metadata Injection throughout PDI are so new. Thanks!