12

I have a text file on Linux where the contents are like below:

help.helloworld.com:latest.world.com
dev.helloworld.com:latest.world.com

I want to get the contents before the colon like below:

help.helloworld.com
dev.helloworld.com

How can I do that within the terminal?

Joel Deleep
  • 2
    The `grep` utility is used for looking for lines matching regular expressions. You could possibly use it here, but it would be more appropriate to use a tool that extracts data from fields given some delimiter, such as the `cut` utility. – Kusalananda Aug 27 '19 at 17:23
  • I've submitted an edit to take out the word "grep" and replace it with "find" in the title and "get" in the question body, to avoid the X/Y issue of assuming `grep` is the right tool to solve the actual problem. – Monty Harder Aug 28 '19 at 18:21
  • 1
    All I can say is that the contents before the colon is much better than the contents after the colon ;-). – Peter - Reinstate Monica Aug 30 '19 at 14:02

7 Answers

40

This is what cut is for:

$ cat file
help.helloworld.com:latest.world.com
dev.helloworld.com:latest.world.com
foo:baz:bar
foo

$ cut -d: -f1 file
help.helloworld.com
dev.helloworld.com
foo
foo

You just set the delimiter to : with -d: and tell it to only print the 1st field (-f1).
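As a small sketch of why the bare `foo` line still shows up above: by default `cut` passes lines that contain no delimiter through unchanged, and the `-s` option suppresses them. The inlined sample data and the variable names `all` and `only_delimited` are made up for illustration.

```shell
# Sample input inlined so the example is self-contained.
input='help.helloworld.com:latest.world.com
foo:baz:bar
foo'

# Default: a line without ':' is printed as-is.
all=$(printf '%s\n' "$input" | cut -d: -f1)

# With -s, lines that contain no delimiter are dropped.
only_delimited=$(printf '%s\n' "$input" | cut -s -d: -f1)

printf '%s\n' "$all"
printf '%s\n' '--'
printf '%s\n' "$only_delimited"
```

Whether you want `-s` depends on whether a colon-less line counts as a valid first field for your data.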

terdon
20

Or an alternative:

$ grep -o '^[^:]*' file
help.helloworld.com
dev.helloworld.com

This returns the run of characters at the start of each line (^) that are not colons ([^:]*), so the match stops just before the first colon.
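A quick check of how this pattern behaves on lines with several colons or none, as a sketch with made-up input (the variable name `first_fields` is only for illustration). Note that `-o` is supported by both GNU and BSD grep.

```shell
# [^:]* cannot cross a colon, so only the text before the first one matches.
first_fields=$(printf '%s\n' 'foo:baz:bar' 'dev.helloworld.com:latest.world.com' 'foo' |
  grep -o '^[^:]*')

printf '%s\n' "$first_fields"
```

Under these assumptions it should print `foo`, `dev.helloworld.com` and `foo`, the same result `cut -d: -f1` gives for those lines.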

Freddy
19

Would definitely recommend awk:

awk -F ':' '{print $1}' file

Uses : as a field separator and prints the first field.
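One thing awk makes easy is adding a condition. As a hedged sketch (not part of the original answer), the variant below prints the first field only for lines that actually contain a colon, by checking that the line splits into more than one field; the sample input and the name `with_colon` are invented for the example.

```shell
# NF > 1 is true only when the separator occurred at least once on the line.
with_colon=$(printf '%s\n' 'help.helloworld.com:latest.world.com' 'no.colon.com' 'colon.at.the.end.com:' |
  awk -F ':' 'NF > 1 { print $1 }')

printf '%s\n' "$with_colon"
```

The `no.colon.com` line is skipped, while the line ending in a colon still yields `colon.at.the.end.com`.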

Centimane
5

updated answer

Considering the following file file.txt:

help.helloworld.com:latest.world.com
dev.helloworld.com:latest.world.com
no.colon.com
colon.at.the.end.com:

You can use sed to remove everything after the colon:

sed -e 's/:.*//' file.txt

This works for all the corner cases pointed out in the comments—if it ends in a colon, or if there is no colon, although these weren't mentioned in the question itself. Thanks to @Rakesh Sharma, @mirabilos, and @Freddy for their comments. Answering questions is a great way to learn.
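For reference, here is a sketch that runs the command against the four sample lines from this answer, fed through a pipe instead of `file.txt` so it is self-contained (the variable name `trimmed` is just for illustration):

```shell
# s/:.*// deletes from the first colon to the end of the line;
# a line without a colon has nothing to substitute and is printed unchanged.
trimmed=$(printf '%s\n' \
  'help.helloworld.com:latest.world.com' \
  'dev.helloworld.com:latest.world.com' \
  'no.colon.com' \
  'colon.at.the.end.com:' |
  sed -e 's/:.*//')

printf '%s\n' "$trimmed"
```

All four host names come out with nothing after them, including the two corner cases.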

kGdmioT
  • 4
    `sed -e 's/:.*//' file.txt` is another way with POSIX sed. – Rakesh Sharma Aug 28 '19 at 04:02
  • 1
    `sed -ne 'y/:/\n/;P' file.txt` also can be used. – Rakesh Sharma Aug 28 '19 at 04:05
  • Change `.+` to `.*`. – Rakesh Sharma Aug 28 '19 at 04:37
  • @Randy Joselyn Since there's an implicit `if` in the `s///p` syntax, you need to modify your regex to take care of lines with no colons, something like `sed -nEe 's/([^:]*)(:.*|)/\1/p'`. Note this requires `GNU sed`, but since you are on GNU sed anyway this shouldn't matter. – Rakesh Sharma Aug 28 '19 at 05:05
  • This answer could have been my favourite, but the ERE are unnecessary. `sed -n '/:/s/^\([^:]*\):.*$/\1/p'` (add `--posix` if you use GNU sed, just to spite the extensionism of theirs) – mirabilos Aug 28 '19 at 18:09
4

Requires GNU grep. It would not work with the default grep on e.g. macOS or any of the other BSDs.

Do you mean like this:

grep -oP '.*(?=:)' file

Output:

help.helloworld.com
dev.helloworld.com

schrodingerscatcuriosity
  • 4
    If there are two or more colons on the line, this will print everything until the last one, so not what the OP needs. Try `echo foo:bar:baz | grep -oP '.*(?=:)'`. This will work for the OP's example, but not for the general case as described in the question. – terdon Aug 27 '19 at 17:19
  • There is only one colon and it's working fine, but thanks for the update. – Joel Deleep Aug 27 '19 at 17:25
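Following up on the comment about multiple colons: one possible adjustment (still GNU grep only, since it relies on `-P`) is to replace the greedy `.*` with `[^:]*` anchored at the start of the line, so the match cannot run past the first colon. This is a sketch and not part of the original answer; the variable names are made up.

```shell
# Greedy version from the answer: runs up to the last colon.
greedy=$(echo 'foo:bar:baz' | grep -oP '.*(?=:)')

# Anchored version: [^:]* cannot cross a colon, so it stops at the first one.
first_only=$(echo 'foo:bar:baz' | grep -oP '^[^:]*(?=:)')

printf '%s\n' "$greedy" "$first_only"
```

Unlike `cut -d: -f1`, both patterns print nothing for a line that has no colon at all, because the lookahead requires one.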
-2

You could achieve this with bash string handling: for each line read, the ${line%%:*} expansion removes the longest suffix matching :*, that is, everything from the first colon onward:

while IFS= read -r line; do printf '%s\n' "${line%%:*}"; done < inputfile

This might be a useful alternative if you are parsing the file in a shell script (though I suspect using cut might be more efficient).
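To make the "longest match" part concrete, here is a small sketch comparing `%%` with `%` on a string that has two colons. The variable names are only for illustration, and these expansions are plain POSIX shell, not bash-specific.

```shell
line='foo:baz:bar'

# %% strips the longest suffix matching ':*' (from the first colon on).
longest=${line%%:*}

# % strips the shortest suffix matching ':*' (from the last colon on).
shortest=${line%:*}

# With no colon the pattern does not match and the value is unchanged.
no_colon='foo'
unchanged=${no_colon%%:*}

printf '%s\n' "$longest" "$shortest" "$unchanged"
```

This prints `foo`, `foo:baz` and `foo`, which is why `%%` is the form that matches the question.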

  • 1
    please read [Why is using a shell loop to process text considered bad practice?](https://unix.stackexchange.com/q/169716/72456) – αғsнιη Aug 31 '19 at 08:29
-2

In pure POSIX shell without using external commands, I'd do:

#!/bin/sh
# Setting IFS only for read keeps the change local to that command.
while IFS=: read -r a _; do
  echo "$a"
done < file.txt

Léa Gris
  • 1
    please read [Why is using a shell loop to process text considered bad practice?](https://unix.stackexchange.com/q/169716/72456) – αғsнιη Aug 31 '19 at 08:30
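As a sketch of how `read` splits a line on the colon (with hypothetical variable names `first` and `rest`, and inlined sample input): the first name gets the text before the first colon, and the last name receives everything that is left, further colons included.

```shell
# IFS=: applies only to the read command it prefixes.
result=$(
  printf '%s\n' 'foo:baz:bar' 'no.colon.com' |
  while IFS=: read -r first rest; do
    printf 'first=%s rest=%s\n' "$first" "$rest"
  done
)

printf '%s\n' "$result"
```

For a line without a colon, `rest` is simply empty, so the first field is still the whole line.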