4

I have 2 files: args and text. For example:

args: [contains arguments of a script]

life
happy
horse

text:

The horse has a happy life.
Life is fun.
Kids are happy.

I need a command that prints the lines from text that contain all the patterns from args. In this case: The horse has a happy life.

mike-m1342
  • How many different `args` are there? – shadowtalker Dec 22 '16 at 13:03
  • A line must match _all_ of the lines in the `args` file to be printed? Or _any_ of the lines? Either way, `grep` may be a useful command for you to investigate (`man grep`) – roaima Dec 22 '16 at 13:12
  • 1
    Must match _all_ the lines in order to be printed. – mike-m1342 Dec 22 '16 at 13:13
  • @don_crissti that's a good candidate but it doesn't really address the unknown number of patterns in the file `args`. – roaima Dec 22 '16 at 13:31
  • 1
    @roaima - I see it as a dupe, the unknown number of pattern is a secondary Q but not the main Q (and should prolly be a separate Q) – don_crissti Dec 22 '16 at 13:36
  • 1
    @roaima - not to mention that [one of the posts there](http://unix.stackexchange.com/a/270700) actually _does answer this question_. – don_crissti Dec 22 '16 at 13:45
  • The [other question](http://unix.stackexchange.com/q/55359/21471) is specific to `grep`, this isn't and it operates on the files as input, so the questions are a bit different, they could lead to different answers. – kenorb Dec 22 '16 at 14:44

4 Answers

4

git grep

Here is the syntax using git grep combining multiple patterns using Boolean expressions:

git grep -e "life" --and -e "happy" --and -e "horse"

The above command will print lines matching all the patterns at once.

If the files aren't under version control, add the --no-index option. From the manual:

Search files in the current directory that is not managed by Git.

Check man git-grep for help.
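
Since the number of patterns in args is not known in advance, the -e ... --and chain can be built in a loop. The following is only a minimal sketch under stated assumptions: the file names args and text come from the question, the scratch directory is assumed to be outside any Git repository (hence --no-index), and -h is added only to drop the file-name prefix from the output.

```shell
#!/bin/sh
# Scratch copies of the question's sample files (assumed names: args, text).
dir=$(mktemp -d) && cd "$dir" || exit 1
printf '%s\n' life happy horse > args
printf '%s\n' 'The horse has a happy life.' 'Life is fun.' 'Kids are happy.' > text

# Build the argument list: -e life --and -e happy --and -e horse
set --
while IFS= read -r patt; do
    [ -n "$patt" ] || continue              # skip empty lines in args
    [ "$#" -eq 0 ] || set -- "$@" --and     # join patterns with --and
    set -- "$@" -e "$patt"
done < args

# --no-index: search files not managed by Git; -h: omit the file-name prefix.
result=$(git grep --no-index -h "$@" -- text)
printf '%s\n' "$result"
```

Using the positional parameters as the argument list keeps each pattern as a single word, so patterns containing spaces are passed through intact.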

grep

Normally grep with the -f option prints lines that match at least one of the patterns, e.g.

grep -f args.txt text.txt

That won't work in your case, because you need the lines that match every pattern.

To print only the lines that match all the patterns at the same time, you can try this command (it needs bash for the process substitution):

while read -r n text; do [ "$n" -eq "$(wc -l < args.txt)" ] && printf '%s\n' "$text"; done < <(while read -r patt; do grep -e "$patt" text.txt; done < args.txt | sort | uniq -c)

Explanation:

  1. The inner while loop runs grep once for each pattern in args.txt, so a line of text.txt is printed once for every pattern it matches.
  2. This list is then sorted (sort) and the occurrences of each line are counted (uniq -c).
  3. The outer while loop prints only the lines whose count equals the number of patterns in args.txt (3 in this example). This assumes text.txt contains no duplicate lines, since duplicates would inflate the counts.
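
The three steps can also be sketched as separate stages, which makes the intermediate results easier to inspect. This is an illustration rather than a replacement for the one-liner: the variable names (hits, counted, total) are made up for the example, and the sample files are recreated in a scratch directory.

```shell
#!/bin/sh
# Scratch copies of the question's sample files.
dir=$(mktemp -d) && cd "$dir" || exit 1
printf '%s\n' life happy horse > args.txt
printf '%s\n' 'The horse has a happy life.' 'Life is fun.' 'Kids are happy.' > text.txt

# Step 1: one grep per pattern; a line appears once for each pattern it matches.
hits=$(while IFS= read -r patt; do grep -e "$patt" text.txt; done < args.txt)

# Step 2: count how many times each distinct line was printed.
counted=$(printf '%s\n' "$hits" | sort | uniq -c)

# Step 3: keep the lines whose count equals the number of patterns.
total=$(grep -c '' args.txt)
result=$(printf '%s\n' "$counted" | while read -r n line; do
    if [ "$n" -eq "$total" ]; then printf '%s\n' "$line"; fi
done)
printf '%s\n' "$result"
```

Printing "$counted" before step 3 shows the per-line counts that the final comparison relies on.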

Another approach is to delete every line that fails to match any one of the patterns.

Here is a solution using the Ex/Vim editor that changes the file in place:

while read -r patt; do ex +"v/$patt/d" -scwq text.txt; done < args.txt

Note: This removes the unneeded lines from the file itself.

Here is a shorter version that only prints the result to the screen:

ex $(xargs -I% printf "+v/%/d " < args.txt) +%p -scq! text.txt

Replace +%p -scq! with -scwq to write the result back to the file instead.


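And here is a solution that defines a shell alias. The alias expands to a pipeline with one grep per pattern, so the input has to arrive on standard input; a file name placed after the alias would only be passed to the last command in the pipeline:

alias grep-all="$(xargs printf 'grep -e "%s" | ' < args.txt)cat"

Sample usage:

cat text.txt | grep-all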

kenorb
1

Use awk.

~$ cat textlist.txt | awk '/life/ && /happy/ && /horse/ { print; }'
The horse has a happy life.

textlist.txt being the list of sentences you gave.
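
When the patterns are not known in advance, awk can read them from the first file and test every line of the second file against all of them. The following is a minimal sketch, assuming the file names args and text from the question and a non-empty args file (the NR == FNR idiom misfires when the first file is empty); the array name pats is just an illustrative choice.

```shell
#!/bin/sh
# Scratch copies of the question's sample files.
dir=$(mktemp -d) && cd "$dir" || exit 1
printf '%s\n' life happy horse > args
printf '%s\n' 'The horse has a happy life.' 'Life is fun.' 'Kids are happy.' > text

# First file (args): store each non-empty line as a regular expression.
# Second file (text): skip the line as soon as one pattern fails to match,
# otherwise print it.
result=$(awk '
    NR == FNR { if ($0 != "") pats[++n] = $0; next }
    {
        for (i = 1; i <= n; i++)
            if ($0 !~ pats[i]) next
        print
    }
' args text)
printf '%s\n' "$result"
```

Each line of args is treated as an awk regular expression, so characters such as . or * keep their special meaning.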

Iskar
  • 2
    `awk` knows how to read files, no need for `cat`, and I suspect that the OP wants to be able to read the patterns from a file at runtime – Eric Renouf Dec 22 '16 at 13:09
  • `args` contains the arguments of a script. I'm not supposed to know what `args` contains. – mike-m1342 Dec 22 '16 at 13:10
  • @mike-m1342 is `args` inside a flat file or are they actual script arguments? If they are script arguments, you can just use the positional parameters in awk ($1, $2, etc). Hopefully that helps, but perhaps I still don't fully understand your goal. – Iskar Dec 22 '16 at 13:16
  • `args` is a file. But I placed there the script arguments. How to use `awk` with all arguments at once? – mike-m1342 Dec 22 '16 at 13:22
1

You can use agrep to apply the AND operation. Unfortunately it can't apply AND to a list of patterns in a file, so you have to expand that into a command argument first.

patterns=$(sed 's/;/\\;/g' <args | tr '\n' ';' | sed 's/;$//')
agrep "$patterns" text

The munging to generate the list of patterns joins them together with semicolons. However, we must not have a semicolon at the end of the list, so that gets removed, and also any pattern already containing a semicolon must have that character escaped so it's not treated as an AND operator. Use echo "$patterns" to see how the args file is converted.
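
As an illustration of the conversion, here is a sketch with a hypothetical args file in which one pattern contains semicolons. It uses the g flag so that every semicolon on a line is escaped, and it only builds and prints the pattern string; agrep itself is not run here.

```shell
#!/bin/sh
dir=$(mktemp -d) && cd "$dir" || exit 1
# Hypothetical args file; the last pattern contains literal semicolons.
printf '%s\n' life happy 'a;b;c' > args

# Escape literal semicolons, join the lines with ';', drop the trailing ';'.
patterns=$(sed 's/;/\\;/g' <args | tr '\n' ';' | sed 's/;$//')
printf '%s\n' "$patterns"    # prints: life;happy;a\;b\;c
```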

roaima
0

Perhaps this is not the most efficient way of doing this, but it seems to work:

# for a in $(cat args); do regex="$regex | grep '$a'"; done
# eval cat text "$regex"
The horse has a happy life.
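
The same chain of greps can be sketched without eval by filtering the text once per pattern in a loop. This is a variation on the idea above, not the exact commands: it assumes the file names args and text from the question and reads args line by line, so a pattern may contain spaces or quotes.

```shell
#!/bin/sh
# Scratch copies of the question's sample files.
dir=$(mktemp -d) && cd "$dir" || exit 1
printf '%s\n' life happy horse > args
printf '%s\n' 'The horse has a happy life.' 'Life is fun.' 'Kids are happy.' > text

# Start with the whole file, then keep only the lines matching each pattern in turn.
result=$(cat text)
while IFS= read -r patt; do
    [ -n "$patt" ] || continue              # skip empty lines in args
    result=$(printf '%s\n' "$result" | grep -e "$patt")
done < args
printf '%s\n' "$result"
```

Because no command string is ever re-parsed by the shell, nothing in args can be executed as code.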

NarūnasK