In a bash or zsh script, how might I extract the host from a URL, e.g. unix.stackexchange.com from http://unix.stackexchange.com/questions/ask, if the latter is in an environment variable?
7
Jeff Schaller
Toothrot
4 Answers
10
You can use parameter expansion, which is available in any POSIX-compliant shell.
$ export FOO=http://unix.stackexchange.com/questions/ask
$ tmp="${FOO#*//}" # remove everything up to and including the first //
$ echo "${tmp%%/*}" # remove the first / and everything after it
unix.stackexchange.com
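The two expansions above can be wrapped in a small function. The following is only a sketch, not part of the original answer: the name `url_host` is hypothetical, and the extra steps for a `user@` prefix and a `:port` suffix are assumptions about what the input might contain. It uses nothing beyond POSIX parameter expansion, does not handle bracketed IPv6 literals, and leaves `host` set as a global variable because POSIX sh has no `local`.

```shell
# url_host URL: print the host part of URL using only parameter expansion.
# Hypothetical helper; assumes a scheme://[user@]host[:port]/path shape.
url_host() {
    host=${1#*//}        # drop the scheme and the leading //
    host=${host%%/*}     # drop the path, starting at the first /
    host=${host##*@}     # drop any user:password@ prefix (assumption)
    host=${host%%:*}     # drop any :port suffix (assumption; breaks on [IPv6])
    printf '%s\n' "$host"
}

FOO=http://unix.stackexchange.com/questions/ask
url_host "$FOO"
```

The path is removed before the `@` step so that an `@` appearing later in the path or query cannot be mistaken for user info.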
A more reliable, but uglier, method is to use an actual URL parser. Here is an example in Python:
$ python3 -c 'import sys; from urllib.parse import urlparse; print(urlparse(sys.argv[1]).netloc)' "$FOO"
unix.stackexchange.com
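Inside a script, the parser call may be easier to reuse as a shell function. A minimal sketch, assuming `python3` is on the PATH; the function name `url_host_py` is hypothetical. It reads `.hostname` instead of `.netloc`, which leaves out any user info and port and returns the host in lower case; the `or ""` prints an empty line when no host is found.

```shell
# url_host_py URL: print the host of URL using Python's urllib.parse.
# Hypothetical wrapper; assumes python3 is installed.
url_host_py() {
    python3 -c 'import sys; from urllib.parse import urlparse; print(urlparse(sys.argv[1]).hostname or "")' "$1"
}

FOO=http://unix.stackexchange.com/questions/ask
url_host_py "$FOO"
```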
David Foerster
jordanm
5
If the URLs all follow this pattern, I have this short and ugly hack for you:
echo "$FOO" | cut -d / -f 3
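Since `cut` works line by line, the same hack also handles a list of URLs on standard input. A small sketch; the function name `hosts_from_stdin` and the second URL are made up for illustration. It depends on every line starting with `scheme://`: a line without a scheme yields whatever happens to be the third `/`-separated field, and a line with no `/` at all is printed unchanged.

```shell
# hosts_from_stdin: read one URL per line, print the third /-separated field.
# Hypothetical helper; assumes every line looks like scheme://host/...
hosts_from_stdin() {
    cut -d / -f 3
}

printf '%s\n' \
    'http://unix.stackexchange.com/questions/ask' \
    'https://example.org/' |
    hosts_from_stdin
```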
David Foerster
3
You can do it in many ways, for example:
export _URL='http://unix.stackexchange.com/questions/ask'
echo "$_URL" | sed -ne 'y|/|\n|;s|.*\n\n||;P'
expr "$_URL" : 'http://\([^/]*\)'
echo "$_URL" | perl -lpe '($_) = m|^http://\K[^/]+|g'
perl -le 'print+(split m{/}, $ENV{_URL})[2]'
(set -f; IFS=/; set -- $_URL; echo "$3";)  # relies on sh-style word splitting; zsh does not split $_URL by default
Rakesh Sharma
Nice alternatives. +1. Though the sed solution has a small mistake; one slash is missing. should be `echo "$_URL" | sed -ne 'y|/|\n|;s/.*\n\n//;P'` or even better `echo "$_URL" | sed -ne 'y|/|\n|;s|.*\n\n||;P'` – George Vasiliou Feb 27 '17 at 23:27
2
This can also be done with regex capture groups:
$ a="http://unix.stackexchange.com/questions/ask"
$ perl -pe 's|(.*//)(.*?)(/.*)|$2|' <<<"$a"
unix.stackexchange.com
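The same capture-group idea can be written with an anchored pattern, so that input which does not look like a URL prints nothing instead of passing through unchanged. This is a sketch under assumptions, not the answer's own method: the function name `url_host_re` is hypothetical, the tool is swapped to POSIX `sed`, and the scheme pattern (a letter followed by letters, digits, `+`, `.` or `-`) is my reading of what a scheme may contain.

```shell
# url_host_re URL: print the text between scheme:// and the next /,
# or nothing if URL does not start with a scheme.
# Hypothetical helper; the captured group may still include user info or a port.
url_host_re() {
    printf '%s\n' "$1" |
        sed -n 's|^[A-Za-z][A-Za-z0-9+.-]*://\([^/]*\).*|\1|p'
}

a="http://unix.stackexchange.com/questions/ask"
url_host_re "$a"
```

With `-n` and the `p` flag, a line is printed only when the substitution succeeds, which is what makes a non-matching input produce no output.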
George Vasiliou