I was recently discussing sed usage with someone. They were having difficulty creating an appropriate regex to handle their problem:
(Orlando) sed regex kicks my ass. I like to remove the second : in the
line. So that in 123:456:6789 it will only returns 123:456. I can find the
first :, but I have not been able to use s/:{2}// to find the second
one, and remove the rest.
I enjoy regex (I know I’m weird, leave me alone), so I was able to provide an answer to this problem:
(@znx) Orlando: the trick is to use [^:] and \1 ..
(@znx) like: s/\(:[^:]*\):.*/\1/
Now obviously at first glance this regex could be a bit intimidating to someone who is still picking up the skills, but as with most things, if we break it down it becomes easier.
Working from the outside in, \( \):.* says: match something, then a :, then all the characters after it. The . has the special meaning “any character” and * means “zero or more of the preceding item”, so .* swallows the rest of the line. Whatever matches inside the escaped brackets \( \) is stored by sed and made available as \1 in the replacement (that is what the brackets do). Inside the brackets we have :[^:]*. The sequence [^ ] is a negated list, which means we are asking it to match any character that is NOT inside the list, in this case :. So :[^:]* matches a : followed by a run of characters that are not :.
Putting it all together we are saying: match the first : and the non-: characters that follow it, then the second : and everything after it. The first : and the text up to the second : are placed in memory. Finally we replace the whole match with what was stored, which drops the second : and the rest of the line.
% echo 123:456:6789 | sed 's/\(:[^:]*\):.*/\1/'
123:456
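For anyone wondering why the original s/:{2}// attempt went nowhere, here is a small sketch. It assumes a sed that accepts -E for extended regex (GNU and BSD sed both do), and the function names are made up purely for this illustration. In extended syntax the brackets no longer need backslashes, and :{2} means "two colons in a row", not "the second colon", so it never matches this input.

```shell
# Hypothetical helper names, used only for this illustration.

# Basic regex (BRE), exactly as in the post: grouping needs \( \).
keep_two_fields_bre() { sed 's/\(:[^:]*\):.*/\1/'; }

# Extended regex (ERE, assumes sed supports -E): same pattern, plain ( ).
keep_two_fields_ere() { sed -E 's/(:[^:]*):.*/\1/'; }

# The original attempt: in ERE, :{2} is an interval meaning "::".
original_attempt() { sed -E 's/:{2}//'; }

echo 123:456:6789 | keep_two_fields_bre   # prints 123:456
echo 123:456:6789 | keep_two_fields_ere   # prints 123:456
echo 123:456:6789 | original_attempt      # prints 123:456:6789 (no match)
```

The last line comes back untouched because there is no :: anywhere in the input for the interval to match.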
Success! However, as with most things, there is more than one way to skin a cat, and regex is rarely the prettiest method. So what other ways can we solve this problem?
With AWK:
% echo 123:456:6789 | awk -F: '{print $1":"$2}'
123:456
With cut:
% echo 123:456:6789 | cut -d: -f1,2
123:456
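The three approaches agree on the example, but it may be worth checking how each behaves when a line has fewer than two colons. The sketch below wraps the commands from this post in functions (the names via_sed, via_awk and via_cut are hypothetical, chosen just for this comparison) and runs them over a few inputs.

```shell
# Hypothetical wrappers around the three commands shown in the post.
via_sed() { sed 's/\(:[^:]*\):.*/\1/'; }
via_awk() { awk -F: '{print $1":"$2}'; }
via_cut() { cut -d: -f1,2; }

# Try a line with two colons, one colon, and no colon at all.
for line in 123:456:6789 123:456 123; do
  printf 'input: %s\n' "$line"
  printf '  sed: %s\n' "$(echo "$line" | via_sed)"
  printf '  awk: %s\n' "$(echo "$line" | via_awk)"
  printf '  cut: %s\n' "$(echo "$line" | via_cut)"
done
```

With no colon in the input, sed and cut pass the line through unchanged, while the awk version prints 123: because it always glues a : between the first two fields, even when the second one is empty.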
As always, experimenting with the mass of GNU tools you can find on your system will bring a great deal more power to your tool chest. Mind you, then I wouldn’t get compliments for helping, would I?
(Orlando) Wow, When I grow up, I like to remember this thing like you do.
Haha, till next time!