Technical Note: jq Array Transformation
Problem
Given the JSON data below, output it in CSV/TSV format.
[
{
"id": 0,
"data": [0, 1, 2]
},
{
"id": 1,
"data": [1, 2, 3]
},
{
"id": 2,
"data": [3, 4, 5]
}
]
Expected output.
"0\t0,1,2"
"1\t1,2,3"
"2\t3,4,5"
How to
jq provides the @tsv and @csv format functions, which convert data to the
corresponding format. Each row of the output table must be formatted as an array
before these functions can consume it. A natural first attempt is the query below.
jq '.[] | [.id, .data] | @tsv' datafile
The above command fails with the following error message:
jq: error (at <stdin>:14): array ([0,1,2]) is not valid in a csv row
exit status 5
The @tsv function needs every element of the row array to be a scalar (a
string, number, boolean, or null); it cannot render a nested array or object.
With the query .[] | [.id, .data], the second element is still an array, so jq
complains that an array is not valid in a row (the message says "csv row" even
though we used @tsv). We need to convert .data to a string before feeding the
row to the @tsv function.
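The check below is a minimal sketch of that diagnosis, assuming jq 1.5 or later is on the PATH. It passes the sample input inline instead of reading datafile, and the variable names (json, types, scalar_row) are illustrative. It prints the type of each row element, then shows that a row made only of scalars is accepted by @tsv.

```shell
# Sample input from the Problem section, stored inline (assumption: no datafile needed).
json='[{"id":0,"data":[0,1,2]},{"id":1,"data":[1,2,3]},{"id":2,"data":[3,4,5]}]'

# Report the jq type of every element in each candidate row.
# The second element is reported as "array", which @tsv rejects.
types=$(printf '%s' "$json" | jq -c '.[] | [.id, .data] | map(type)')
printf '%s\n' "$types"

# A row that contains only scalars (number, string, boolean) is accepted.
scalar_row=$(printf '%s' "$json" | jq -r '.[0] | [.id, "a", true] | @tsv')
printf '%s\n' "$scalar_row"
```

Inspecting the row with map(type) before applying a format function is a quick way to find which element needs converting.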
We can use jq's join function to create a string from an array, for example
.data | join(","). For jq to evaluate this as a sub-expression inside the
array constructor, we need to wrap it in parentheses (). This gives the query
below.
jq '.[] | [.id, (.data | join(","))]' datafile
[
0,
"0,1,2"
]
[
1,
"1,2,3"
]
[
2,
"3,4,5"
]
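To see why the parentheses matter, the sketch below runs the query with and without them. In jq the pipe binds more loosely than the comma, so without parentheses join is applied to .id as well as .data, and the query fails. It assumes jq 1.5 or later; the variable names (json, unparenthesized, parenthesized) are illustrative and the input is passed inline rather than read from datafile.

```shell
# Sample input from the Problem section, stored inline.
json='[{"id":0,"data":[0,1,2]},{"id":1,"data":[1,2,3]},{"id":2,"data":[3,4,5]}]'

# Without parentheses the body is parsed as (.id, .data) | join(","),
# so join also receives the number .id and jq exits with an error.
if printf '%s' "$json" | jq -c '.[] | [.id, .data | join(",")]' >/dev/null 2>&1; then
  unparenthesized=ok
else
  unparenthesized=failed
fi

# With parentheses, join only sees .data.
parenthesized=$(printf '%s' "$json" | jq -c '.[0] | [.id, (.data | join(","))]')

printf '%s\n' "$unparenthesized" "$parenthesized"
```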
Now we can pipe each row to the @tsv function to produce TSV output.
jq '.[] | [.id, (.data | join(","))] | @tsv' datafile
"0\t0,1,2"
"1\t1,2,3"
"2\t3,4,5"
Conclusion
Wrap a sub-expression in parentheses () to transform part of the data before the rest of the query consumes it.
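The final query prints each row as a JSON string, which is why the tab appears as \t inside quotes. As a possible follow-up, the sketch below adds jq's -r (--raw-output) option to emit real tab characters, and swaps @tsv for @csv to cover the CSV half of the problem. It assumes jq 1.5 or later; the variable names (json, tsv, csv) are illustrative and the input is passed inline rather than read from datafile.

```shell
# Sample input from the Problem section, stored inline.
json='[{"id":0,"data":[0,1,2]},{"id":1,"data":[1,2,3]},{"id":2,"data":[3,4,5]}]'

# -r prints the formatted string raw, so each line contains a literal tab.
tsv=$(printf '%s' "$json" | jq -r '.[] | [.id, (.data | join(","))] | @tsv')
printf '%s\n' "$tsv"

# The same rows through @csv: string fields are double-quoted, numbers are not.
csv=$(printf '%s' "$json" | jq -r '.[] | [.id, (.data | join(","))] | @csv')
printf '%s\n' "$csv"
```

In the CSV output the joined field is quoted, so the commas inside it are not mistaken for column separators.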