1

How can I do this with sed? I want to cut a variable's value up to a specific location, in my case _g.

Example

variable="This_is_good_g0r0s0_continues"

I need to cut this variable up to _g. I should also mention that the number of characters before _g varies.

jimmij
Uma

3 Answers

2

You can do that with the built-in parameter expansion operators available in all POSIX shells:

variable="This_is_good_g0r0s0_continues"
up_to_first__g="${variable%%_g*}"
up_to_last__g="${variable%_g*}"
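As a quick check of what the two operators return, here is a minimal sketch using the variable from the question. The names `up_to_first_g`, `up_to_last_g` and `no_match` are made up for illustration; only the `%%` and `%` expansions come from the answer itself.

```shell
variable="This_is_good_g0r0s0_continues"

# %% removes the longest suffix matching _g*, so the cut lands at the first _g.
up_to_first_g="${variable%%_g*}"

# % removes the shortest suffix matching _g*, so the cut lands at the last _g.
up_to_last_g="${variable%_g*}"

printf '%s\n' "$up_to_first_g"    # This_is
printf '%s\n' "$up_to_last_g"     # This_is_good

# If the value contains no _g at all, nothing matches and it is left unchanged.
no_match="no marker here"
printf '%s\n' "${no_match%%_g*}"  # no marker here
```

No external process is started here, which is the main reason to prefer this over sed when the text is already in a shell variable.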
Stéphane Chazelas
  • 1
    Although it is not important for this question, as a good practice "please quote your variables in your answers on this site!". http://unix.stackexchange.com/questions/171346/security-implications-of-forgetting-to-quote-a-variable-in-bash-posix-shells. Sorry, couldn't resist ;) – jimmij Dec 03 '14 at 22:55
  • @jimmij. You're right. Sounds like I'll have to fix my bad habits. – Stéphane Chazelas Dec 03 '14 at 22:58
  • 1
    @jimmij [The right-hand side of an assignment is implicitly double-quoted](http://unix.stackexchange.com/questions/68694/when-is-double-quoting-necessary/68748#68748). But it is arguably pedagogically sound to quote anyway when posting answers here (especially because the quotes are necessary in seemingly similar cases such as `export foo="$bar"`). – Gilles 'SO- stop being evil' Dec 04 '14 at 01:27
0

If you want to cut at the last _g, try the following:

$ sed 's/\(.*\)_g.*/\1/' <<< 'This_is_good_g0r0s0_continues'
This_is_good

If you want to cut at the first _g, then:

$ sed 's/_g.*//' <<< 'This_is_good_g0r0s0_continues'
This_is
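Since the question is about a shell variable, the same two sed commands can be applied to a variable and the result captured again. This is only a sketch: the names `first` and `last` are made up, and `printf` piped into sed is used instead of `<<<` because the here-string is not available in every sh (dash, for example, lacks it).

```shell
variable="This_is_good_g0r0s0_continues"

# Cut at the first _g: delete everything from the first _g to the end of the line.
first=$(printf '%s\n' "$variable" | sed 's/_g.*//')

# Cut at the last _g: the greedy .* inside the group runs up to the last _g.
last=$(printf '%s\n' "$variable" | sed 's/\(.*\)_g.*/\1/')

printf '%s\n' "$first"   # This_is
printf '%s\n' "$last"    # This_is_good
```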
jimmij
0

With a BRE - with sed - you can slice match sequences by counting their occurrences. I do not think that sed is an ideal tool for slicing shell variable values - the shell already provides a fairly intuitive means of doing that - but that is covered in another answer.

When a BRE subexpression is repeated, the overall match extends as far right as it can, and the back-reference keeps only the last repetition. For example:

echo 0123456789 | sed 's/\([0-9]\)*/\1/'

...prints 9. It gets a little more useful when you split with actual counts though.

Another example:

echo _good _goroso _goes _gop |
sed 's/\(.*_g\)\{2\}/\1/'

...which gets...

oes _gop

This won't work like you might think, though - or at least it didn't work like I expected when I started playing with it. You can't go back another _g there with \{3\} - not in this case anyway. The pattern is too variable: .* matches everything - including nothing - and so it is difficult to quantify. What you can do is continue to split it:

echo _good _goroso _goes _gop |
sed 's/\(\(.*_g\)\{2\}\)\{2\}/\1/'

...which prints

oroso _goes _gop

Perhaps it is better to say squeeze than split. Here - by requiring at least so many occurrences of a zero-or-more .* pattern - I effectively limit its possible match, which will otherwise always be as greedy as it can. So I slice the smallest piece by smallest piece off the end of what it matches.
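To compare the three counts discussed so far side by side, here is a small sketch. The variable names are made up for illustration, and the outputs in the comments are what I would expect from a POSIX-conforming sed, which I am assuming here.

```shell
input='_good _goroso _goes _gop'

# Two repetitions: \1 keeps only the last one, "oes _g".
two=$(printf '%s\n' "$input" | sed 's/\(.*_g\)\{2\}/\1/')

# Three repetitions: the last repetition is still "oes _g", so nothing changes.
three=$(printf '%s\n' "$input" | sed 's/\(.*_g\)\{3\}/\1/')

# Nesting the counted group: \1 is now the last repetition of the outer pair.
nested=$(printf '%s\n' "$input" | sed 's/\(\(.*_g\)\{2\}\)\{2\}/\1/')

printf '%s\n' "$two"      # oes _gop
printf '%s\n' "$three"    # oes _gop
printf '%s\n' "$nested"   # oroso _goes _gop
```

Raising the count only changes how the earlier repetitions divide the text; the final repetition stays the smallest piece that ends at the last _g.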

This is more easily seen by looking at the match itself: the whole match is still available in &, in addition to the portion already captured in \1. I've often found such eccentricities useful, personally, though I wouldn't be surprised to learn I am alone in that. You might, for example, compound the match from each end like:

echo _good _goroso _goes _gop |
sed 's/\(\(.*_g\)\{2\}\)\{2\}/&\1/'

...which prints:

_good _goroso _goes _goroso _goes _gop

...because & holds the entire match, while \1 holds only the last repetition of the outer group, which is just a portion of it.

...or even...

echo _good _goroso _goes _gop |
sed 's/\(\(.*_g\)\{2\}\)\{2\}/\2&\1/'

...which shuffles every level of the match like...

oes _g_good _goroso _goes _goroso _goes _gop
mikeserv