Channel: Pentaho Community Forums

MongoDB Input unwind

Mate - I am processing a JSON document that is hierarchical in nature. It holds survey information. One survey can have many questions, one question can have many answers, and one answer can have many plugins.

Source data:-
Code:

{
    "_id": {"$oid": "5673417f677aff45e70001c5"},
    "title": "Survey 01",
    "questions": [
        {
            "_id": {"$oid": "56734183677aff45e70001c6"},
            "body": "Which browser do you use? Any plugin to mention?",
            "answers": [
                {
                    "_id": {"$oid": "56734183677aff45e70001c7"},
                    "body": "Google Chrome",
                    "plugins": [
                        {
                            "_id": {"$oid": "56ce0767988ceb49c5000002"},
                            "name": "Ad blocker",
                            "browser_plugins_id": {"$oid": "566886c040041260650003cc"}
                        },
                        {
                            "_id": {"$oid": "56ce076c988ceb49c5000003"},
                            "name": "Chrome password manager",
                            "browser_alert_id": {"$oid": "5644fa0a5241491f9abe0200"}
                        }
                    ]
                },
                {
                    "_id": {"$oid": "56ce07b2988ceb49c5000006"},
                    "body": "Internet Explorer 11"
                },
                {
                    "_id": {"$oid": "56ce07b4988ceb49c5000007"},
                    "body": "Mozilla Firefox"
                }
            ]
        },
        {
            "_id": {"$oid": "56ce079b988ceb49c5000004"},
            "body": "Which website you surf most?",
            "answers": [
                {
                    "_id": {"$oid": "56ce079b988ceb49c5000005"},
                    "title": "Facebook"
                }
            ]
        }
    ],
    "status": "active"
}

I need to produce normalized data, broken down to the plugin level. So I need 5 records in total: there are 4 answers, and one of them has 2 plugins, so that answer yields 2 rows. I am attaching the expected output (survey.xls) for reference.
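To make the target shape concrete, here is a minimal Python sketch of the normalization described above. It is only an illustration of the expected rows, not the PDI solution. The function name `flatten_survey`, the output column names, and the trimmed sample (IDs omitted) are assumptions made for this example. The fallback from "body" to "title" is also an assumption, added because the Facebook answer in the sample uses "title".

```python
def flatten_survey(survey):
    """Yield one row per plugin; an answer without plugins yields one row."""
    for question in survey.get("questions", []):
        for answer in question.get("answers", []):
            # Missing or empty "plugins" still produces a single row.
            plugins = answer.get("plugins") or [None]
            for plugin in plugins:
                yield {
                    "survey_title": survey.get("title"),
                    "question_body": question.get("body"),
                    # Assumption: fall back to "title" when "body" is absent.
                    "answer_body": answer.get("body", answer.get("title")),
                    "plugin_name": plugin.get("name") if plugin else None,
                }


# Trimmed copy of the sample document (IDs left out for brevity).
survey = {
    "title": "Survey 01",
    "questions": [
        {
            "body": "Which browser do you use? Any plugin to mention?",
            "answers": [
                {
                    "body": "Google Chrome",
                    "plugins": [
                        {"name": "Ad blocker"},
                        {"name": "Chrome password manager"},
                    ],
                },
                {"body": "Internet Explorer 11"},
                {"body": "Mozilla Firefox"},
            ],
        },
        {
            "body": "Which website you surf most?",
            "answers": [{"title": "Facebook"}],
        },
    ],
}

rows = list(flatten_survey(survey))
for row in rows:
    print(row)
```

With the sample document this gives 5 rows: 2 for Google Chrome (one per plugin) and 1 each for the other three answers. A question with no answers at all would produce no row in this sketch.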

I have a working solution, but it is not very clean. I read everything from MongoDB into a single row using array indexes. So I define fields for 5 questions ($.questions[0]._id and so on), then 5 answers for each question ($.questions[0].answers[0]._id and so on), and so forth. I then normalize this data and filter out the null questions, answers, etc.

Instead, I want to use something like unwind so that I get this format directly out of the MongoDB Input step, and so that it handles any number of questions or answers.
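For illustration, the kind of query being asked about might look like the aggregation pipeline below, built here in Python and printed as JSON. This is a sketch under assumptions, not a tested PDI configuration. It assumes the MongoDB Input step can run an aggregation pipeline in this PDI version, and it uses the document form of `$unwind` with `preserveNullAndEmptyArrays`, which MongoDB supports from 3.2 onward, so that answers without plugins are kept. The helper name `unwind` and the output field names are made up for this example.

```python
import json


def unwind(path):
    """Build a $unwind stage that keeps documents whose array is missing or empty."""
    return {"$unwind": {"path": path, "preserveNullAndEmptyArrays": True}}


pipeline = [
    # One document per question, then per answer, then per plugin.
    unwind("$questions"),
    unwind("$questions.answers"),
    unwind("$questions.answers.plugins"),
    # Flatten the nested fields into top-level columns (names are illustrative).
    {
        "$project": {
            "title": 1,
            "question_id": "$questions._id",
            "question_body": "$questions.body",
            "answer_id": "$questions.answers._id",
            "answer_body": "$questions.answers.body",
            "plugin_id": "$questions.answers.plugins._id",
            "plugin_name": "$questions.answers.plugins.name",
        }
    },
]

pipeline_json = json.dumps(pipeline, indent=2)
print(pipeline_json)
```

The printed JSON may need adapting to whatever pipeline syntax the step expects. Without `preserveNullAndEmptyArrays`, the three answers that have no plugins would be dropped by the last `$unwind`, leaving only the 2 plugin rows instead of 5.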

Any help will be much appreciated.

env:- PDI 5.4.0.1-131 CE, Windows 10, Java build 1.8.0_25-b18, MongoDB 3.2.3

Regards,
Ritesh
Attached Files
