How to extract data from a JSON file

Question

I have been searching for a solution to my problem, but either I did not find one or I did not understand what I found. I am running smart home control software on a Raspberry Pi. Using pilight-receive, I can capture the data from my outdoor temperature sensor. The output of pilight-receive looks like this:

{
        "message": {
                "id": 4095,
                "temperature": 409.5
        },
        "origin": "receiver",
        "protocol": "alecto_wsd17",
        "uuid": "0000-b8-27-eb-0f3db7",
        "repeats": 3
}
{
        "message": {
                "id": 1490,
                "temperature": 25.1,
                "humidity": 40.0,
                "battery": 1
        },
        "origin": "receiver",
        "protocol": "alecto_ws1700",
        "uuid": "0000-b8-27-eb-0f3db7",
        "repeats": 3
}
{
        "message": {
                "id": 2039,
                "temperature": 409.5
        },
        "origin": "receiver",
        "protocol": "alecto_wsd17",
        "uuid": "0000-b8-27-eb-0f3db7",
        "repeats": 4
}

Now my question is: how can I extract the temperature and humidity from messages where the id is 1490? And how would you recommend checking this regularly? With a cron job that runs every 10 minutes, captures the output of pilight-receive, extracts the data from that output and pushes it to the Smart Home Control API?

The format seems to be [JSON](https://en.wikipedia.org/wiki/JSON). There are plenty of ways to parse JSON. It depends on what you are comfortable with. Python? JavaScript? Something else? — muru, Nov 16 '15 at 21:29
I know a bit of Python and a bit of JavaScript; mostly I know C++ and C#. But after seeing all the awk and sed commands I thought there must be some easy command xD — Raul Garcia Sanchez, Nov 16 '15 at 21:53
It's not difficult with `awk` and `sed` provided the JSON output retains the formatting shown here, which it need not - whitespace doesn't matter for JSON. For example, this `awk` command: `awk '/temperature|humidity/ {print $2}'` is close. — muru, Nov 16 '15 at 21:59
with `ksh93` json parsing is builtin to [`read`](http://unix.stackexchange.com/a/207125/52934). — mikeserv, Nov 17 '15 at 02:04
See my answer which simply uses `grep` and accomplishes the job just fine, with very little margin for error. — rubynorails, Nov 17 '15 at 03:38
Wow I didn't expect that amount of answers. Thanks a lot to everybody. I will try the jq command posted by cas. Therefore I have to update my raspberry to Jessie, as the package is not available in wheezy. However I will do so tonight, to achieve my goal :) — Raul Garcia Sanchez, Nov 17 '15 at 07:43
check wheezy-backports. it might be in there, saving you an upgrade to jessie (unless you were planning to upgrade anyway). aha! it IS backported to wheezy. https://packages.debian.org/wheezy-backports/jq — cas, Nov 17 '15 at 07:45
Oh, I didn't know about this repo. I will definitely try it. Thanks again :) — Raul Garcia Sanchez, Nov 17 '15 at 08:25

cas · Accepted Answer · 2015-11-17T02:02:18.153

43

You can use jq to process JSON files in the shell.

For example, I saved your sample JSON output as raul.json and then ran:

$ jq .message.temperature raul.json 
409.5
25.1
409.5
$ jq .message.humidity raul.json 
null
40
null

jq is available pre-packaged for most Linux distros.

There's probably a way to do it in jq itself, but the simplest way I found to get both the wanted values on one line is to use xargs. For example:

$ jq 'select(.message.id == 1490) | .message.temperature, .message.humidity' raul.json | xargs
25.1 40

or, if you want to loop through each .message.id instance, we can add .message.id to the output and use xargs -n 3 as we know that there will be three fields (id, temperature, humidity):

$ jq '.message.id, .message.temperature, .message.humidity' raul.json | xargs -n 3
4095 409.5 null
1490 25.1 40
2039 409.5 null

You could then post-process that output with awk or whatever.

Finally, both Python and Perl have excellent libraries for parsing and manipulating JSON data, as do several other languages, including PHP and Java.
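As a rough illustration of the library route, here is a minimal Python sketch that uses only the standard `json` module. It assumes the whole pilight-receive output has already been captured into one string (shortened here to `SAMPLE`). The function names `iter_json_objects` and `readings_for` are made up for this example. Because the output is several JSON objects back to back rather than one document, the sketch walks through the text with `json.JSONDecoder.raw_decode` instead of calling `json.loads` once.

```python
import json

# Shortened copy of the sample output from the question (assumption: the
# real output has been captured into a single string like this one).
SAMPLE = '''
{ "message": { "id": 4095, "temperature": 409.5 },
  "origin": "receiver", "protocol": "alecto_wsd17", "repeats": 3 }
{ "message": { "id": 1490, "temperature": 25.1, "humidity": 40.0, "battery": 1 },
  "origin": "receiver", "protocol": "alecto_ws1700", "repeats": 3 }
{ "message": { "id": 2039, "temperature": 409.5 },
  "origin": "receiver", "protocol": "alecto_wsd17", "repeats": 4 }
'''


def iter_json_objects(text):
    """Yield each top-level JSON value from a string of concatenated values."""
    decoder = json.JSONDecoder()
    pos, end = 0, len(text)
    while pos < end:
        # raw_decode() does not skip leading whitespace, so skip it here.
        while pos < end and text[pos].isspace():
            pos += 1
        if pos >= end:
            break
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


def readings_for(text, sensor_id):
    """Return (temperature, humidity) tuples for messages with the given id."""
    results = []
    for obj in iter_json_objects(text):
        msg = obj.get("message", {})
        if msg.get("id") == sensor_id:
            # .get() returns None when a sensor reports no humidity.
            results.append((msg.get("temperature"), msg.get("humidity")))
    return results


if __name__ == "__main__":
    for temperature, humidity in readings_for(SAMPLE, 1490):
        print(temperature, humidity)  # prints: 25.1 40.0
```

Unlike the line-oriented approaches, this does not depend on how the JSON is indented or where the line breaks fall.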

edited Nov 17 '15 at 02:02

answered Nov 16 '15 at 22:11

cas


specifically, `jq 'select(.message.id == 1490) | .message.temperature, .message.humidity' raul.json` – glenn jackman Nov 16 '15 at 22:27
yep. i've been trying to get jq to print them on one line. – cas Nov 16 '15 at 22:29
`jq ... | paste -d " " - -` – glenn jackman Nov 16 '15 at 22:30
or, in bash, `{ read temp; read hum; } < <(jq ...)` – glenn jackman Nov 16 '15 at 22:31
See my answer which simply uses `grep`. It may not work for some specific versions of `grep`, but it's more straight-forward than `jq` in this scenario, even though `jq` is designed specifically for parsing JSON. I did give the `jq` answer an upvote though, regardless. It is indeed a tool for the job, but sometimes you can simply remove staples with your fingers rather than searching around for a staple-remover. – rubynorails Nov 17 '15 at 03:32
json can't be reliably parsed with regular expressions any more than xml or html can. and most json data (e.g. fetched via a web api) doesn't come nicely formatted with extra line-feeds and indentation. to parse json reliably, you need a json parser. `jq` is one such for shell scripts. other languages have json parsing libraries. – cas Nov 17 '15 at 03:43
Thanks a lot everyone I will use your recommended jq command to achieve my goal. – Raul Garcia Sanchez Nov 17 '15 at 07:44
anything can be reliably parsed with regular expressions. it just depends on how *many* you use. how do you think `jq` does it? – mikeserv Nov 17 '15 at 14:26
@mikeserv, no. not everything. no matter how many you use. jq does it with a lot more than just regular expressions. a lexical grammar for example. start here: https://en.wikipedia.org/wiki/Lexical_analysis – cas Nov 17 '15 at 21:00
right, such as w/ the regex scanner `lex`? – mikeserv Nov 17 '15 at 22:26
the scanner is only a tiny part of a parser, and doesn't even necessarily use regexes (sometimes a scanf() is all that's needed). by far the larger and more important part is the grammar which encodes an "understanding" of the language's symbols and syntax. regexes "understand" nothing, they just search for patterns. – cas Nov 17 '15 at 22:30
`lex`'s regular expression scanners are stateful. anyone implementing a scanner which does *not* take advantage of that fact is *wasting* a perfectly good resource. a regular expression isn't designed to *understand* anything - it is designed to find and extract. – mikeserv Nov 19 '15 at 18:13

rubynorails · Answer 2 · 2015-11-17T03:22:24.083

For those who don't understand advanced awk as well as they'd like to (like me) and don't have jq pre-installed, an easy solution is to pipe a couple of standard commands together, like so:

grep -A2 '"id": 1490,' stats.json | sed '/1490/d;s/"//g;s/,//;s/\s*//'

If you're only trying to get the values, it's easier just using grep rather than awk or sed:

grep -A2 '"id": 1490,' stats.json | grep -o "[0-9]*\.[0-9]*"

To explain why this seems like the simplest way to me:

grep -A2 grabs the line you are looking for in the JSON along with the following 2 lines, which contain the temperature and humidity.
The pipe to grep -o prints only digits separated by a . (which never occurs on the first line, the one containing 1490), so you are left with your 2 values: temperature and humidity. Very simple. Even simpler than using jq, in my opinion.

nwk · Answer 3 · 2015-11-19T13:18:17.510

1

My tool of choice for processing JSON on the command line is jq. However, if you don't have jq installed you can do pretty well with Perl:

$ perl -MJSON -e '$/ = undef; my $data = <>; for my $hash (JSON->new->incr_parse($data)) { my $msg = $hash->{message}; print "$msg->{temperature} $msg->{humidity}\n" if $msg->{id} == 1490 }' < data.json
25.1 40
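For the "check it frequently" part of the question, a similar incremental idea can be sketched in Python. This is an untested-against-pilight sketch: it assumes the objects arrive pretty-printed, one after another, on a line-based stream, as in the question. The function name `stream_messages` is made up for this example. It appends lines to a buffer and retries `json.loads` until the buffer holds a complete object. That is simple but re-parses the buffer on every line, and a corrupt line would stall it, so treat it as a starting point only.

```python
import json

# Two objects from the question, split into lines the way a pipe would
# deliver them (assumption: the real stream looks like this).
SAMPLE_LINES = '''{
        "message": {
                "id": 4095,
                "temperature": 409.5
        },
        "origin": "receiver",
        "repeats": 3
}
{
        "message": {
                "id": 1490,
                "temperature": 25.1,
                "humidity": 40.0,
                "battery": 1
        },
        "origin": "receiver",
        "repeats": 3
}
'''.splitlines(keepends=True)


def stream_messages(lines):
    """Yield a parsed object each time the buffered lines form valid JSON."""
    buffer = ""
    for line in lines:
        buffer += line
        if not buffer.strip():
            buffer = ""  # nothing but whitespace so far
            continue
        try:
            obj = json.loads(buffer)
        except json.JSONDecodeError:
            continue  # object not complete yet; keep accumulating
        buffer = ""
        yield obj


if __name__ == "__main__":
    for obj in stream_messages(SAMPLE_LINES):
        msg = obj.get("message", {})
        if msg.get("id") == 1490:
            print(msg.get("temperature"), msg.get("humidity"))  # prints: 25.1 40.0
```

Passing `sys.stdin` instead of `SAMPLE_LINES` would let the same loop sit at the end of a pipe from pilight-receive. Each matching reading could then be pushed to the smart home API as it arrives, rather than polled from cron.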

edited Nov 19 '15 at 13:18

answered Nov 18 '15 at 21:56

nwk


score 0 · Answer 4 · answered Nov 16 '15 at 22:36

jq is by far the most elegant solution. With awk you could write

awk -v id=1490 '
    $1 == "\"id\":" && $2 == id"," {matched = 1}
    $1 == "}," {matched = 0}
    matched && $1 ~ /temperature|humidity/ {sub(/,/,"", $2); print $2}
' file

glenn jackman


score 0 · Answer 5 · answered Nov 12 '18 at 17:47

Your output is a sequence of separate JSON objects rather than a single JSON document. Once you wrap the output in a single array, e.g. like this (assuming your output is in file.json):

echo "[ $(cat file.json | sed -E 's/^}$/},/; $d') }]"

then it's easy to achieve what you want with the jtc tool (available at: https://github.com/ldn-softdev/jtc):

$ echo "[ $(cat file.json | sed -E 's/^}$/},/; $d') }]" | jtc -x "[id]:<1490>d [-1]" -y[temperature] -y[humidity] -l
"temperature": 25.1
"humidity": 40.0

In the example above, drop -l if you don't want the labels printed.

score 0 · Answer 6 · answered Jul 12 '21 at 07:21

To get the temperature and humidity from each message with id 1490, as a tab-delimited list, you may use

jq -r '.message | select(.id == 1490) | [ .temperature, .humidity ] | @tsv'

Output given the data in the question:

25.1    40

To get CSV output, with the addition of a header, use

jq -s -r '[ "temperature", "humidity" ], (.[] | .message | select(.id == 1490) | [ .temperature, .humidity ]) | @csv'

Note the added -s here to use jq in "slurp mode". It reads all the objects in the input set into a single array and we use this to first give the @csv operator the CSV header as an array, and then a set of arrays containing the individual CSV records that we have extracted from the data.

Output given the data in the question:

"temperature","humidity"
25.1,40
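For comparison, the same "header row first, then one record per matching message" shape can be sketched with Python's standard `csv` module. This is only an illustration: the function name `to_csv` and the in-memory `MESSAGES` list are assumptions made for the example, and the number formatting differs slightly from jq (Python writes `40.0` where jq prints `40`).

```python
import csv
import io

# Already-parsed objects, standing in for the data in the question.
MESSAGES = [
    {"message": {"id": 4095, "temperature": 409.5}},
    {"message": {"id": 1490, "temperature": 25.1, "humidity": 40.0, "battery": 1}},
    {"message": {"id": 2039, "temperature": 409.5}},
]


def to_csv(objects, sensor_id):
    """Return CSV text: a header row, then one row per matching message."""
    out = io.StringIO()
    # QUOTE_NONNUMERIC quotes the string header fields and leaves numbers bare.
    writer = csv.writer(out, quoting=csv.QUOTE_NONNUMERIC, lineterminator="\n")
    writer.writerow(["temperature", "humidity"])
    for obj in objects:
        msg = obj.get("message", {})
        if msg.get("id") == sensor_id:
            writer.writerow([msg.get("temperature"), msg.get("humidity")])
    return out.getvalue()


if __name__ == "__main__":
    print(to_csv(MESSAGES, 1490), end="")
```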
