Technical Note: jq Array Transformation
Problem
Given the JSON data below, output it in CSV/TSV format.
[
{
"id": 0,
"data": [0, 1, 2]
},
{
"id": 1,
"data": [1, 2, 3]
},
{
"id": 2,
"data": [3, 4, 5]
}
]
Expected output.
"0\t0,1,2"
"1\t1,2,3"
"2\t3,4,5"
How to
jq provides the @tsv and @csv functions, which convert data to the corresponding format. Each row of the output table must be formatted as an array before these functions can consume it. Naturally, we write the jq query as below:
jq '.[] | [.id, .data] | @tsv' datafile
The above command will output the error message:
jq: error (at <stdin>:14): array ([0,1,2]) is not valid in a csv row exit status 5
The @tsv function needs every element of the row array to be a scalar (string, number, boolean, or null) in order to convert it to TSV output. With the query .[] | [.id, .data], the second element is still an array, and jq complains that an array is not valid in a row (the message above says "csv row" even though @tsv was used). We need to convert .data to a string before feeding it to the @tsv function.
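To see the rule in isolation, the following sketch feeds @tsv one row of scalars and one row containing a nested array. It assumes jq 1.5 or later is on the PATH; the -n (no input) and -r (raw output) options are used only to keep the example self-contained, and the sample rows are made up for illustration.

```shell
# A row of scalars is accepted: numbers and booleans are printed as-is,
# and null becomes an empty field.
scalars=$(jq -n -r '[1, "a", true, null] | @tsv')
printf '%s\n' "$scalars"

# A row containing a nested array is rejected with a non-zero exit status.
if ! jq -n -r '[0, [0, 1, 2]] | @tsv' 2>/dev/null; then
  nested="rejected"
  echo "nested array rejected"
fi
```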
We could use the jq join function to create a string from an array, with something like .data | join(","). For jq to evaluate it as a sub-expression inside the array constructor, we need to wrap it in parentheses (). This gives another jq query, shown below with its output:
jq '.[] | [.id, (.data | join(","))]' datafile
[
0,
"0,1,2"
]
[
1,
"1,2,3"
]
[
2,
"3,4,5"
]
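The parentheses matter because the pipe binds more loosely than the comma in jq. As a minimal sketch (assuming jq 1.5 or later and a single hypothetical sample object in the variable row), the unparenthesized form pipes both .id and .data into join, which fails on the number, while the parenthesized form joins only .data:

```shell
row='{"id": 0, "data": [0, 1, 2]}'

# Without parentheses this parses as [(.id, .data) | join(",")],
# so join is applied to the number .id and jq reports an error.
if ! printf '%s' "$row" | jq -c '[.id, .data | join(",")]' 2>/dev/null; then
  unparenthesized="failed"
  echo "unparenthesized form failed"
fi

# With parentheses only .data is joined.
joined=$(printf '%s' "$row" | jq -c '[.id, (.data | join(","))]')
echo "$joined"
```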
Now we can feed the output to the @tsv function to produce TSV:
jq '.[] | [.id, (.data | join(","))] | @tsv' datafile
"0\t0,1,2"
"1\t1,2,3"
"2\t3,4,5"
Conclusion
Wrap a sub-expression in parentheses () to transform part of the data.
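One practical follow-up: the final query prints each row as a JSON string, so the tab appears as the escape \t inside quotes. If plain TSV or CSV text is wanted, for example to redirect into a file, jq's -r (--raw-output) option prints the string contents directly. A sketch, assuming jq 1.5 or later and recreating the sample data in a temporary file:

```shell
datafile=$(mktemp)
cat > "$datafile" <<'EOF'
[
  {"id": 0, "data": [0, 1, 2]},
  {"id": 1, "data": [1, 2, 3]},
  {"id": 2, "data": [3, 4, 5]}
]
EOF

# -r emits real tab characters instead of a quoted JSON string.
tsv=$(jq -r '.[] | [.id, (.data | join(","))] | @tsv' "$datafile")
printf '%s\n' "$tsv"

# The same rows as CSV; @csv double-quotes string fields, so the joined
# field stays a single column despite its embedded commas.
csv=$(jq -r '.[] | [.id, (.data | join(","))] | @csv' "$datafile")
printf '%s\n' "$csv"

rm -f "$datafile"
```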