Technical Note: jq Array Transformation


Problem

Given the JSON data below, output it in CSV/TSV format.

[
    {
        "id": 0,
        "data": [0, 1, 2]
    },
    {
        "id": 1,
        "data": [1, 2, 3]
    },
    {
        "id": 2,
        "data": [3, 4, 5]
    }
]

Expected output:

"0\t0,1,2"
"1\t1,2,3"
"2\t3,4,5"

How to

jq provides the @tsv and @csv format functions, which convert data to the corresponding format. Each row of the output table must be an array before these functions can consume it. Naturally, we write the jq query as below:

jq '.[] | [.id, .data] | @tsv' datafile

The above command fails with this error message:

jq: error (at <stdin>:14): array ([0,1,2]) is not valid in a csv row
exit status 5

@tsv requires every element of the row array to be a scalar (a string, number, boolean, or null). With the query .[] | [.id, .data], the second element of each row is still an array, and jq complains that an array is not valid in a CSV row (the same message is reported for @tsv). We need to convert .data to a string before feeding the row to @tsv.
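The rule above can be checked directly from the shell. This is a minimal sketch, assuming jq 1.5 or later is on the PATH; the variable names ok and nested are illustrative only. It uses jq -n so that no input file is needed.

```shell
# A row made only of scalars (number, string, boolean) is accepted by @tsv.
ok=$(jq -n '[0, "a", true] | @tsv')
echo "$ok"

# A row containing a nested array is rejected: jq exits with a nonzero status.
if jq -n '[0, [0, 1, 2]] | @tsv' >/dev/null 2>&1; then
  nested="accepted"
else
  nested="rejected"
fi
echo "$nested"
```

The second command reproduces the failure described above on a single hand-written row, which makes it easier to see that the nested array, not the input file, is the cause.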

We can use jq's join function to create a string from an array, with something like .data | join(","). Because the pipe binds more loosely than the comma, [.id, .data | join(",")] would apply join to both .id and .data, so we wrap the sub-expression in parentheses (). This gives another jq query, shown below with its output.

jq '.[] | [.id, (.data | join(","))]' datafile
[
  0,
  "0,1,2"
]
[
  1,
  "1,2,3"
]
[
  2,
  "3,4,5"
]

Now we can feed each row to @tsv to produce TSV output.

jq '.[] | [.id, (.data | join(","))] | @tsv' datafile
"0\t0,1,2"
"1\t1,2,3"
"2\t3,4,5"
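The lines above are JSON strings, so the tab appears as an escaped \t inside quotes. As a hedged sketch (assuming jq 1.5 or later, with the sample data inlined in an illustrative shell variable named data), the -r (raw output) option prints the strings unquoted with real tab characters, which is usually what a TSV consumer expects.

```shell
# Sample input from the Problem section, inlined so the example is self-contained.
data='[{"id":0,"data":[0,1,2]},{"id":1,"data":[1,2,3]},{"id":2,"data":[3,4,5]}]'

# -r writes each resulting string raw: no surrounding quotes, real tab characters.
tsv=$(printf '%s' "$data" | jq -r '.[] | [.id, (.data | join(","))] | @tsv')
printf '%s\n' "$tsv"
```

Each printed line then has the id, a literal tab, and the comma-joined data values, ready to be redirected into a .tsv file.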

Conclusion

Wrap a sub-expression in parentheses () to transform part of the data while building a row.
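The same parenthesized sub-expression works for the CSV half of the problem. The sketch below is an illustration rather than part of the original walkthrough; it assumes jq 1.5 or later and reuses the sample data in an illustrative variable named data. @csv wraps string fields in double quotes, so the commas inside the joined field do not split the column.

```shell
# Sample input from the Problem section, inlined so the example is self-contained.
data='[{"id":0,"data":[0,1,2]},{"id":1,"data":[1,2,3]},{"id":2,"data":[3,4,5]}]'

# Build the same rows as before, but format them with @csv instead of @tsv.
csv=$(printf '%s' "$data" | jq -r '.[] | [.id, (.data | join(","))] | @csv')
printf '%s\n' "$csv"
# First line printed: 0,"0,1,2"
```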