Kubectl JSON output handling and JQ


WHY

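Kubernetes can be viewed as an "object store". Objects are represented as YAML or JSON, and the two are equivalent.

With -o json, kubectl's get command reads a set of objects and returns them as JSON. This JSON is often large and deeply nested. Common needs include:

  • Getting a subset of the output, often deeply nested: certain fields rather than the whole JSON
  • Filtering out irrelevant objects by field, e.g. returning only failed jobs
  • Sorting: by creation date, status, etc.

Rather than treating the JSON as text and grepping it every time, we need an approach that fits JSON better.

In essence, the JSON output is our data source, and we need a query language for it, just as we would for a relational database.

This article shows how to process kubectl output with jq and jid in common scenarios.

The sample code can be downloaded from GitHub.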

Introduction to JQ

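jq is first of all a binary tool for querying JSON. Because its query features are so powerful, it can also be regarded as a language in its own right, similar to awk.

Let's start with the basic features jq provides, using this demo JSON: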

{ "store": {
    "book": [
      { "category": "reference",
        "author": "Nigel Rees",
        "title": "Sayings of the Century",
        "price": 8.95
      },
      { "category": "fiction",
        "author": "Evelyn Waugh",
        "title": "Sword of Honour",
        "price": 12.99
      },
      { "category": "fiction",
        "author": "Herman Melville",
        "title": "Moby Dick",
        "isbn": "0-553-21311-3",
        "price": 8.99
      },
      { "category": "fiction",
        "author": "J. R. R. Tolkien",
        "title": "The Lord of the Rings",
        "isbn": "0-395-19395-8",
        "price": 22.99
      }
    ],
    "bicycle": {
      "color": "red",
      "price": 19.95
    }
  }
}
# prettify JSON
cat example.json | jq .
# extract a field inside one array element
cat example.json | jq '.store.book[1].price'
# extract a field from every array element; emits multiple outputs
cat example.json | jq '.store.book[].price'
# turn the array into an iterator, i.e. emit each element separately rather than one JSON array
cat example.json | jq '.store.book | .[]'
# construct a new object
cat example.json | jq '.store.book[1] | {bookTitle: .title, bookPrice: .price}'
# construct an array: collect multiple elements into a single array
cat example.json | jq '[.store.book | .[]]'

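The jq tutorial also provides some examples and is worth reading.

A closer look at JQ

The following is drawn mainly from the JQ Manual.

Take one concrete jq command as an example: query Kubernetes jobs sorted by creation time, filter them to keep only the failed jobs, and reconstruct the objects.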

kubectl get jobs -o json --sort-by=.metadata.creationTimestamp | jq -r '.items | .[] | select(.metadata.name | contains("openworld")) | select(.status.active != 1) | select(.status.conditions[0].type == "Failed") | {name: .metadata.name, time: .metadata.creationTimestamp, status: .status}'

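As shown, a jq query is a series of stages joined by |, similar to a UNIX pipe connecting the input and output of independent components.

Each component is called a filter: it takes an input and produces an output. Filters are built from jq's built-in operators, each performing a different transformation.

Some filters produce multiple outputs (call one of them filter A). When the output of filter A is piped into filter B, jq runs filter B once for each output of A; there is no need to write an explicit for-loop, just pipe the two together.

Every filter has an input and an output. Even a constant such as 123 is a filter: whatever the input, it outputs 123. In jq, everything is a filter.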
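
To make the pipe semantics concrete, here is a minimal sketch that runs locally with jq -n, so no input file is needed. The variable names are only illustrative.

```shell
# [1, 2, 3] | .[] emits three separate outputs.
# The next filter (. * 10) then runs once per output, with no explicit loop.
scaled=$(jq -c -n '[1, 2, 3] | .[] | . * 10')
echo "$scaled"

# A constant is also a filter: it ignores its input and always emits itself.
constant=$(echo '{"ignored": true}' | jq '123')
echo "$constant"
```

The first command prints 10, 20 and 30 on separate lines; the second prints 123 regardless of the input object.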

Filters

.                   # the simplest filter: echoes its input
.foo, .foo.bar      # object index; .foo.bar is equivalent to .foo | .bar
                    # the strict form is .["foo"], needed when the key has special characters
.[2]                # array index
.[2:5]              # array slice
.[]                 # array iterator: turns a single array into multiple elements
filterA, filterB    # fans the input out to several filters and combines their outputs,
                    # e.g. echo "{}" | jq '1, 2, 3' prints three numbers on three lines
|                   # connects two filters; itself a special filter
(expr)              # grouping operator: treats expr as a single expression
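
A few of these forms side by side, as a small sketch. The document in doc and its keys (foo, my-key) are made up for illustration.

```shell
doc='{"foo": {"bar": 1}, "my-key": [10, 20, 30, 40, 50, 60]}'

# .foo.bar and .foo | .bar select the same value.
same=$(printf '%s' "$doc" | jq -c '[.foo.bar, (.foo | .bar)]')

# A key containing "-" needs the .["..."] form; [2:5] keeps indices 2, 3 and 4.
slice=$(printf '%s' "$doc" | jq -c '.["my-key"][2:5]')

# The comma binds more loosely than *, so parentheses change the result.
ungrouped=$(jq -c -n '[1, 2 * 10]')
grouped=$(jq -c -n '[(1, 2) * 10]')

echo "$same $slice $ungrouped $grouped"
```

Here ungrouped is [1,20] while grouped is [10,20], which is the effect (expr) has on evaluation order.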

Types && Values

jq supports the same set of datatypes as JSON - numbers, strings, booleans, arrays, objects (which in JSON-speak are hashes with only string keys), and “null”.

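The main type-changing constructs are []: array construction, and {}: object construction.

[] collects multiple outputs into one JSON array, i.e. a single value, as in the earlier example:

cat example.json | jq '[.store.book | .[]]'

.[] breaks an array apart into multiple outputs, so [] and .[] are inverse operations.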
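
The round trip can be checked with a throwaway array (the values are arbitrary):

```shell
# .[] breaks one array into separate outputs (one per line with -c).
split_out=$(jq -c -n '[1, [2, 3], "x"] | .[]')

# Wrapping the same expression in [ ] collects the outputs back into one array.
round_trip=$(jq -c -n '[1, [2, 3], "x"] | [.[]]')

echo "$split_out"
echo "$round_trip"
```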

{} reconstructs an object: when the original object does not fit our needs, we build a new object from its fields, as in the earlier example:

| {name: .metadata.name, time: .metadata.creationTimestamp, status: .status}

A common case is wanting only some fields of the input object as a new object:

{title: .title, author: .author}

This can be abbreviated to: {title, author}
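
A quick check that the long and short forms agree, using one book from the demo JSON as inline input:

```shell
book='{"category": "fiction", "author": "Herman Melville", "title": "Moby Dick", "price": 8.99}'

# Explicit form: each key is paired with the path that supplies its value.
long_form=$(printf '%s' "$book" | jq -c '{title: .title, author: .author}')

# Shorthand: {title, author} copies the fields of the same name from the input.
short_form=$(printf '%s' "$book" | jq -c '{title, author}')

echo "$long_form"
echo "$short_form"
```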

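To recursively descend through every sub-object of a big input object, use ..:

cat example.json | jq '.. | .price?'

In this example both the bicycle and the books have a price, and all of them are listed. The trailing ? (as in .bar?) keeps jq from raising an error when a value cannot be indexed with bar, for example a string or a number.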
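
The sketch below uses a trimmed-down, hypothetical variant of the demo JSON (integer prices, fewer fields) to show what the recursive query emits. Objects without a price field yield null rather than an error, so a type filter such as numbers is a convenient way to drop them.

```shell
doc='{"store": {"book": [{"title": "A", "price": 9}, {"title": "B", "price": 13}],
                "bicycle": {"color": "red", "price": 20}}}'

# Every object visited by .. produces one output: null when it has no price field.
# Strings, numbers and arrays cannot be indexed with .price; the ? hides those errors.
raw_count=$(printf '%s' "$doc" | jq '[.. | .price?] | length')

# Keep only the real prices and sort them.
prices=$(printf '%s' "$doc" | jq -c '[.. | .price? | numbers] | sort')

echo "$raw_count $prices"
```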

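That covers the basic elements of jq. In practice, beyond the fundamentals, knowing the built-ins is essential; advanced cases that require writing your own operators are rare.

Built-in operators and functions

Here are some commonly used operators and functions. Using them feels a lot like functional programming.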

select(boolean_expression)

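An important filter, typically used to filter the elements of an array:

If the input satisfies boolean_expression, it is passed through unchanged; otherwise it is dropped.

boolean_expression can refer to the input and combine other filters:

jq -n '[1,2,3] | map(select(. >= 2))'
Output [2, 3]

Another example: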

jq '.[] | select(.id == "second")'
Input [{"id": "first", "val": 1}, {"id": "second", "val": 2}]
Output {"id": "second", "val": 2}
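
Chained select calls behave like a logical AND. The sketch below replays part of the earlier job query against inline sample data, so it can be tried without a cluster. The job names and the reduced object shape are hypothetical.

```shell
jobs='{"items": [
  {"metadata": {"name": "openworld-import"}, "status": {"conditions": [{"type": "Failed"}]}},
  {"metadata": {"name": "openworld-export"}, "status": {"conditions": [{"type": "Complete"}]}},
  {"metadata": {"name": "billing-sync"},     "status": {"conditions": [{"type": "Failed"}]}}
]}'

# Keep jobs whose name contains "openworld" AND whose first condition is Failed.
failed=$(printf '%s' "$jobs" | jq -r '.items[]
  | select(.metadata.name | contains("openworld"))
  | select(.status.conditions[0].type == "Failed")
  | .metadata.name')

echo "$failed"
```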

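Arithmetic

+ -

These look simple at first but do a lot:

a + b combines two filters: the same input is passed to both a and b, and their outputs are then "added":

  • Numbers: ordinary addition
  • Arrays: concatenated into one larger array
  • Strings: concatenated, like arrays
  • Objects: merged; on a key conflict the right-hand side wins

- works similarly, but only when both sides are numbers or both are arrays.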
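
One example per operand type, as a minimal sketch with arbitrary values:

```shell
# Number, array, string and object addition in one array.
sums=$(jq -c -n '[1 + 2, ([1, 2] + [2, 3]), ("ab" + "cd"), ({"a": 1, "b": 2} + {"b": 9})]')

# Array subtraction removes every occurrence of the right-hand elements.
diff=$(jq -c -n '[1, 2, 3, 2] - [2]')

# Both operands see the same input: .a and .b are read from the same object.
total=$(echo '{"a": 1, "b": 2}' | jq '.a + .b')

echo "$sums $diff $total"
```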

* / %

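These are normally applied to two numbers. * and / also accept some string and object operands, but the results are rather unusual and probably rarely needed, so they are skipped here.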

keys

keys, keys_unsorted

Given an object, these output its keys as an array: keys returns them sorted, keys_unsorted in their original order.
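
The difference between the two shows up as soon as the input keys are not already in alphabetical order (sample object is arbitrary):

```shell
obj='{"b": 1, "a": 2}'

sorted_keys=$(printf '%s' "$obj" | jq -c 'keys')            # sorted
original_keys=$(printf '%s' "$obj" | jq -c 'keys_unsorted') # input order

echo "$sorted_keys $original_keys"
```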

has(key):

  • input: object; output: boolean, whether the object has the given key.
  • input: array; output: boolean, whether the array has an element at the given index.

jq 'map(has("foo"))'
Input [{"foo": 42}, {}]
Output [true, false]

in: tests whether the input key is in the given object, or the input index is in the given array; essentially the inverse of has.

jq 'map(in([0,1]))'
Input [2, 0]
Output [false, true]

map(expr) map_values(expr)

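  • input: an array for map, an object for map_values;
  • expr: an expression or function applied to each element (map) or each field value (map_values);
  • output: a new array or object with expr applied
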
jq 'map_values(.+1)'
Input {"a": 1, "b": 2, "c": 3}
Output {"a": 2, "b": 3, "c": 4}

Other built-ins can be passed in, for example to get the type of each element.

jq 'map(type)'
Input [0, false, [], {}, null, "hello"]
Output ["number", "boolean", "array", "object", "null", "string"]

filter by data type

arrays, objects, iterables, booleans, numbers, normals, finites, strings, nulls, values, scalars

These built-ins select only inputs that are arrays, objects, iterables (arrays or objects), booleans, numbers, normal numbers, finite numbers, strings, null, non-null values, and non-iterables, respectively.

Only inputs of the matching type are kept:

jq '[.[] | numbers, nulls]'
Input [[],{},1,"foo",null,true,false]
Output [1, null]

any all

input: an array of boolean values
output: any is true if at least one element is true; all is true only if every element is true.

jq 'any'
Input [true, false]
Output true
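
Both built-ins also accept a condition, any(condition) and all(condition), which saves a separate map step. A small sketch with arbitrary numbers:

```shell
nums='[1, 5, 3]'

any_big=$(printf '%s' "$nums" | jq 'any(. > 4)')           # at least one element > 4
all_big=$(printf '%s' "$nums" | jq 'all(. > 4)')           # every element > 4
all_positive=$(printf '%s' "$nums" | jq 'map(. > 0) | all') # explicit map, then all

echo "$any_big $all_big $all_positive"
```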

array operation

  • sort, sort_by(path_expression)
    jq 'sort_by(.foo)'
    Input [{"foo":4, "bar":10}, {"foo":3, "bar":100}, {"foo":2, "bar":1}]
    Output [{"foo":2, "bar":1}, {"foo":3, "bar":100}, {"foo":4, "bar":10}]
  • unique, unique_by
  • min, max, min_by, max_by
  • reverse
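
The other array built-ins follow the same pattern as sort_by. The sketch below applies several of them to a made-up list (the name and restarts fields are hypothetical):

```shell
pods='[{"name": "c", "restarts": 2}, {"name": "a", "restarts": 5}, {"name": "b", "restarts": 2}]'

by_name=$(printf '%s' "$pods" | jq -c 'sort_by(.name) | map(.name)')
most_restarts=$(printf '%s' "$pods" | jq -r 'max_by(.restarts) | .name')
distinct=$(printf '%s' "$pods" | jq -c 'map(.restarts) | unique')
reversed=$(printf '%s' "$pods" | jq -c 'map(.name) | reverse')

echo "$by_name $most_restarts $distinct $reversed"
```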

string operation

A range of operations:

  • contains(s)
  • index(s), rindex(s)
  • inside
  • startswith(str), endswith(str)
  • split, join
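
A short sketch of these string built-ins, applied to a hypothetical pod name passed in with --arg:

```shell
name="web-frontend-7d9c"

starts=$(jq -n --arg n "$name" '$n | startswith("web-")')
has_front=$(jq -n --arg n "$name" '$n | contains("frontend")')
first_dash=$(jq -n --arg n "$name" '$n | index("-")')
parts=$(jq -c -n --arg n "$name" '$n | split("-")')
joined=$(jq -r -n --arg n "$name" '$n | split("-") | join("/")')

echo "$starts $has_front $first_dash $parts $joined"
```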

String interpolation - \(foo)

jq '"The input was \(.), which is one less than \(.+1)"'
Input 42
Output "The input was 42, which is one less than 43"

length

The "length" of a value: supported for strings, arrays, and objects
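
For example, with arbitrary values (null is included because its length is defined as 0):

```shell
# String: number of characters; array: number of elements;
# object: number of key-value pairs; null: 0.
lengths=$(jq -c -n '["hello", [1, 2, 3], {"a": 1}, null] | map(length)')
echo "$lengths"
```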

Comparison and logical operators

Their effect is self-explanatory; each returns true or false:

  • ==, !=
    jq '.[] == 1'
    Input [1, 1.0, "1", "banana"]
    Output true
    true
    false
    false
  • >, >=, <=, <
  • and or not
    jq '[true, false | not]'
    Input null
    Output [false, true]

Alternative operator: a // b: if a produces a value other than false or null, output a; otherwise output b. In effect it supplies a default value.

jq '.foo // 42'
Input {}
Output 42

These can be tried directly on jq play to see the results.

Locally, jq -n 'expr' can be used to test a query.

Regular expressions (PCRE)

jq supports full regular-expression search, using the same regex library as PHP, Ruby, Sublime, etc.: Oniguruma

Regex operations:

STRING | FILTER( REGEX )
STRING | FILTER( REGEX; FLAGS )
STRING | FILTER( [REGEX] )
STRING | FILTER( [REGEX, FLAGS] )

  • STRING is the string to match, supplied as input
  • FILTER is one of:
    • match: finds matches and outputs a match object
    • test: Like match, but does not return match objects, only true or false
    • capture: collects the named captures into a new object

FLAGS is a string consisting of one or more of the supported flags:

g - Global search (find all matches, not just the first)
i - Case insensitive search
m - Multi line mode ('.' will match newlines)
n - Ignore empty matches
p - Both s and m modes are enabled
s - Single line mode ('^' -> '\A', '$' -> '\Z')
l - Find longest possible matches
x - Extended regex format (ignore whitespace and comments)

Testing a series of matches:

jq '.[] | test("a b c # spaces are ignored"; "ix")'
Input ["xabcd", "ABC"]
Output true
true

This can be combined with select for regex matching: select(.metadata.name | test("test-"))
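
To see how test, match and capture differ, the sketch below runs all three on one hypothetical image string. It assumes a jq build with regex (Oniguruma) support.

```shell
image="nginx:1.25"

# test: boolean only.
is_nginx=$(jq -n --arg s "$image" '$s | test("^nginx:")')

# match with the g flag: one match object per hit; .string is the matched text.
numbers=$(jq -c -n --arg s "$image" '$s | [match("[0-9]+"; "g").string]')

# capture: named groups become fields of a new object.
repo_tag=$(jq -r -n --arg s "$image" \
  '$s | capture("^(?<repo>[^:]+):(?<tag>.+)$") | "\(.repo) \(.tag)"')

echo "$is_nginx $numbers $repo_tag"
```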

JQ Output format

These are special filters, usually applied just before the final output to do any necessary formatting and encoding/decoding. The syntax is @foo, and the following are available:

  • @text: calls tostring
  • @json: serializes to JSON
  • @html: HTML escaping: <>&'" map to &lt;, &gt;, &amp;, &#39;, &quot;.
  • @uri: percent-encoding
  • @csv, @tsv: the input must be an array; renders it as CSV or TSV
  • @sh: escapes for a POSIX shell
  • @base64: base64 encode; @base64d: base64 decode
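
A few of these formats in action, as a sketch with arbitrary values. The -r flag prints the formatted string raw instead of re-quoting it as JSON, and @base64d assumes jq 1.6 or later.

```shell
csv_row=$(jq -r -n '["web", 3, "a,b"] | @csv')   # strings quoted, numbers bare
tsv_row=$(jq -r -n '["web", 3] | @tsv')          # tab-separated
encoded=$(jq -r -n '"hello" | @base64')
decoded=$(jq -r -n '"aGVsbG8=" | @base64d')
escaped=$(jq -r -n '"a b&c" | @uri')

echo "$csv_row"
echo "$tsv_row"
echo "$encoded $decoded $escaped"
```

With kubectl, @base64d is handy for decoding Secret values, and @csv or @tsv for feeding results into spreadsheets or column -t.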

JQ, JSON path, Kubectl

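The JSONPath standard offers a query syntax different from jq's; for a comparison of the two, see the jq docs: For JSONPath users

kubectl itself supports JSONPath (see the official docs), but as those docs note:

JSONPath regular expressions are not supported. If you want to match using regular expressions, you can use a tool such as jq.
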
kubectl get pods -o json | jq -r '.items[] | select(.metadata.name | test("test-")).spec.containers[].image'
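
The same query can be tried offline by feeding jq a hand-written stand-in for the kubectl output. The pod names and images below are hypothetical, and the object shape is reduced to the fields the query touches.

```shell
pods='{"items": [
  {"metadata": {"name": "test-api"},
   "spec": {"containers": [{"image": "nginx:1.25"}, {"image": "busybox:1.36"}]}},
  {"metadata": {"name": "prod-api"},
   "spec": {"containers": [{"image": "redis:7"}]}}
]}'

# Same jq program as above: keep pods whose name matches "test-", list their images.
images=$(printf '%s' "$pods" | jq -r \
  '.items[] | select(.metadata.name | test("test-")).spec.containers[].image')

echo "$images"
```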

Visualizing field selection with jid

jid's use case is generating a JSON query statement: step by step, you select a deeply nested field out of a large JSON document, with autocompletion.

Install the jid tool:

go get -u github.com/simeji/jid/cmd/jid

Using jid is very simple:

cat example.json | jid -q | xclip -selection c

Specify fields one after another to filter, using Tab for autocompletion. The result is piped to the clipboard, so Ctrl+V or xclip -selection c -o yields the generated query: .store.book[1].price

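Conclusion

Our main goal is to query kubectl's JSON output with jq. Some suggestions:

  • Use kubectl's --sort-by to specify the sort order (the sort field must be an integer or a string)
  • If the JSON result is large and complex, first use jid to obtain a query path that jq can use directly
  • Pipe kubectl's output into jq to filter and transform it into the result you want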

Ref

JQ manual
Kubectl output options