Read numbers from control file and Extract matching line numbers from the data file

Question

I have a control file - cntl.txt

2
3
5

Data file - data.txt

red
blue
yellow
green
violet
orange

I need to read the lines of the data file whose line numbers are listed in the control file. Here the expected output is:

blue
yellow
violet

andreatsh · Answer 1 · 2016-11-02T22:47:17.020

5

Example of a very inefficient solution:

for i in $(<cntl.txt); do awk -v c="$i" 'NR==c{ print $0 }' data.txt; done;

Here is also a better solution that I learned tonight:

awk 'FNR==NR{ z[$0]++;next }; FNR in z' cntl.txt data.txt
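For readers new to this two-file idiom, here is a commented sketch of the same approach. It recreates the sample files from the question in a scratch directory; the scratch directory and the variable names `tmp` and `out` are illustrative assumptions, not part of the answer. It also drops the `++`, since merely referencing `z[$0]` is enough to create the array key.

```shell
# Work in a scratch directory so nothing in the current directory is touched.
tmp=$(mktemp -d) && cd "$tmp" || exit 1

# Recreate the sample files from the question.
printf '%s\n' 2 3 5 > cntl.txt
printf '%s\n' red blue yellow green violet orange > data.txt

# FNR==NR is true only while awk reads the first file (cntl.txt):
#   store each line number as an array key, then skip to the next record.
# For the second file (data.txt), print the line when its number is a key.
out=$(awk 'FNR==NR { z[$0]; next } FNR in z' cntl.txt data.txt)
printf '%s\n' "$out"
```

With the sample files this should print blue, yellow and violet, one per line. The data file is read only once, which is what makes this faster than starting awk once per control number.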

edited Nov 02 '16 at 22:47

answered Nov 02 '16 at 21:21


What @don_crissti said is true. Also see [Why is using a shell loop to process text considered bad practice?](http://unix.stackexchange.com/q/169716/135943) – Wildcard Nov 02 '16 at 22:11
@don_crissti You're right. I knew it was very inefficient, but unfortunately it was the best I could think of. I still have a lot to learn, I really appreciate your help! – andreatsh Nov 02 '16 at 22:28
Your `awk` solution is certainly worth an upvote imho (especially since the OP tagged `awk`) – steeldriver Nov 02 '16 at 23:14
note that `z[$0]` alone will do... no need of `++`... – Sundeep Jun 20 '17 at 14:44

Wildcard · Answer 2 · 2016-11-02T22:05:53.800

4

Using only POSIX specified features of Sed:

sed -n -e "$(sed '/./s/$/p/' cntl.txt)" data.txt

Of course, if your cntl.txt file contains lines other than numbers, you may get an error. Empty lines, however, are handled correctly (i.e. they do not affect the output).
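To make the mechanism visible, the following sketch prints the script that the inner sed generates before handing it to the outer sed. The empty line in the control file is added on purpose to exercise the case described above; the scratch directory and the variable names `tmp`, `script` and `out` are assumptions made for illustration.

```shell
# Work in a scratch directory so nothing in the current directory is touched.
tmp=$(mktemp -d) && cd "$tmp" || exit 1

# Sample control file with an empty line added on purpose, plus the data file.
printf '%s\n' 2 '' 3 5 > cntl.txt
printf '%s\n' red blue yellow green violet orange > data.txt

# Inner sed: on every non-empty line (/./), append "p" at the end of the line.
# The result is itself a small sed script: 2p, an empty line, 3p, 5p.
script=$(sed '/./s/$/p/' cntl.txt)
printf '%s\n' "$script"

# Outer sed: -n suppresses the default output, so only the lines selected by
# the generated "Np" commands are printed.
out=$(sed -n -e "$script" data.txt)
printf '%s\n' "$out"
```

Because the `/./` address leaves empty lines empty, they never turn into a bare `p` command, which would print every line of the data file.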

edited Nov 02 '16 at 22:05

answered Nov 02 '16 at 21:52


This does not work if you have 10+ lines, as join expects lexicographically sorted data. – zeppelin Nov 02 '16 at 22:04
@zeppelin, you are quite correct. I've upvoted your answer and have now removed the incorrect portion of this answer. – Wildcard Nov 02 '16 at 22:06
I've upvoted your updated answer. – zeppelin Nov 02 '16 at 22:07

zeppelin · Answer 3 · 2016-11-02T21:33:36.760

3

Try this:

join <(nl data.txt|sort -k1b,1) <(cat cntl.txt|sort -k1b,1) | sort -nk1,1 | cut -d' ' -f2-

nl - will enumerate lines for you

 1  red
 2  blue
 3  yellow
 4  green
 5  violet
 6  orange

| sort -k1b,1 - will sort them by the line number (first field), lexicographically

cat cntl.txt| sort -k1b,1 - will sort the control file in the same order

2
3
5

join <() <() - will join the sorted (and numbered) "data" with the sorted "control", on the first field (i.e. line number)

2 blue
3 yellow
5 violet

|sort -nk1,1 - will re-sort the results numerically (to put the lines back in order)

| cut -d' ' -f2- - will drop the line number field

blue
yellow
violet
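The steps above can also be tried on a file with ten or more lines, where the lexicographic pre-sort and the final numeric re-sort both matter. This sketch uses temporary files instead of `<(...)`, so it should also run in shells without process substitution. The 12-line data file, the control numbers 2, 10 and 11, and the file names numbered.sorted and cntl.sorted are made up for illustration.

```shell
# Work in a scratch directory so nothing in the current directory is touched.
tmp=$(mktemp -d) && cd "$tmp" || exit 1

# Hypothetical 12-line data file and a control file that includes
# two-digit line numbers.
printf '%s\n' a b c d e f g h i j k l > data.txt
printf '%s\n' 2 10 11 > cntl.txt

# Number the data lines and sort both inputs the way join expects
# (lexicographically on field 1, ignoring leading blanks).
nl data.txt | sort -k1b,1 > numbered.sorted
sort -k1b,1 cntl.txt > cntl.sorted

# Join on the line number, restore numeric order, drop the number field.
out=$(join numbered.sorted cntl.sorted | sort -nk1,1 | cut -d' ' -f2-)
printf '%s\n' "$out"
```

Here join emits the matches in lexicographic key order (10, 11, 2), and the final `sort -nk1,1` puts them back in file order, so the output should be b, j and k.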

edited Nov 02 '16 at 21:33

answered Nov 02 '16 at 21:21


> That seems silly That is not silly, as join operates on lexicographically sorted data, not numerically sorted data. – zeppelin Nov 02 '16 at 21:58
> Also, <(...) is a Bash-ism, not specified by POSIX Yep, this is probably not canonically POSIX, but it is not a bash-only extension either; at least _ksh_ and _zsh_ support it too. – zeppelin Nov 02 '16 at 22:03

Ipor Sircer · Answer 4 · 2016-11-02T22:14:48.450

2

With sed only:

sed -n "$(sed -e 's/$/p;/' < cntl.txt)" data.txt

edited Nov 02 '16 at 22:14

answered Nov 02 '16 at 21:41


Good, but no reason to use `tr`. Newlines are a perfectly acceptable command separator for Sed. (Also, `$(...)` is the preferred syntax for command substitution.) – Wildcard Nov 02 '16 at 22:10

score -1 · Answer 5 · answered Nov 02 '16 at 21:49

-1

Another possible solution:

IFS=$'\n' read -d '' -r -a colors < data.txt

for i in $(<cntl.txt); do
        echo "${colors[i-1]}"
done

The first line sets the internal field separator (IFS) to a newline for the duration of the read command and stores each line of data.txt as an element of the array colors. The loop then goes through the numbers in cntl.txt and prints the array element at that index minus 1, because bash arrays are indexed from 0 while the line numbers in cntl.txt start from 1.

answered Nov 02 '16 at 21:49

Nemanja Martinovic


See [Why is using a shell loop to process text considered bad practice?](http://unix.stackexchange.com/q/169716/135943) – Wildcard Nov 02 '16 at 22:11
