do a sed / awk filter with python tools (at least as fast)

Mathieu Prevot
Posted: Mon Jul 07, 2008 3:32 pm    Post subject: do a sed / awk filter with python tools (at least as fast)
Hi,
I use the following filter in a Bourne shell script:

    sed '/watch?v=/! d;s/.*v=//;s/\(.\{11\}\).*/\1/' \
        | sort | uniq | awk 'ORS=" "{print $1}'

It gives me every 11-character string that follows the "watch?v=" motif. I would like to do the same in Python on the stdout of a subprocess.Popen instance, using Python tools rather than sed, awk, etc. How can I do this? Can I expect it to be as fast?
Thanks,
Mathieu
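For readers less used to sed, here is a rough Python sketch of what the filter above does to each line before sort, uniq and awk see it. The helper name `sed_step` is made up for illustration, and the sketch only approximates the three sed commands; it is not meant as an exact reimplementation of sed.

```python
import re


def sed_step(line):
    """Approximate the three sed commands for one line (hypothetical helper)."""
    line = line.rstrip("\n")
    # /watch?v=/! d : drop lines that do not contain "watch?v=".
    # In sed's basic regex syntax "?" is a literal character, hence the escape.
    if not re.search(r"watch\?v=", line):
        return None
    # s/.*v=// : ".*" is greedy, so everything up to the LAST "v=" is removed.
    line = re.sub(r".*v=", "", line, count=1)
    # s/\(.\{11\}\).*/\1/ : keep the first 11 characters; a shorter
    # remainder does not match and is left unchanged.
    line = re.sub(r"(.{11}).*", r"\1", line, count=1)
    return line
```

Note the greedy `.*v=`: if a line contains a second "v=" after the "watch?v=" part, the text after that last "v=" is what survives.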
Peter Otten
Posted: Mon Jul 07, 2008 6:04 pm    Post subject: Re: do a sed / awk filter with python tools (at least as fas
Mathieu Prevot wrote:
Quote:
I use the following filter in a Bourne shell script:

    sed '/watch?v=/! d;s/.*v=//;s/\(.\{11\}\).*/\1/' \
        | sort | uniq | awk 'ORS=" "{print $1}'

It gives me every 11-character string that follows the "watch?v=" motif. I would like to do the same in Python on the stdout of a subprocess.Popen instance, using Python tools rather than sed, awk, etc. How can I do this? Can I expect it to be as fast?

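You should either do it in Python, e.g.:

    def process(lines):
        # Split each line around the first "/watch?v=" it contains.
        candidates = (line.rstrip().partition("/watch?v=") for line in lines)
        # Keep the first 11 characters that follow the marker.
        matches = (c[:11] for a, b, c in candidates if len(c) >= 11)
        # Deduplicate, sort, and print the results separated by spaces.
        print(" ".join(sorted(set(matches))))

    if __name__ == "__main__":
        import sys
        process(sys.stdin)
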
or invoke your shell script via subprocess.Popen(). Invoking a python script via subprocess doesn't make sense IMHO.
Peter
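Since the question was about filtering the stdout of a subprocess.Popen instance, here is a minimal sketch of how that might look, assuming Python 3.7 or later (for `text=True`). The names `extract_ids` and `ids_from_command` are hypothetical, and the child process is only a stand-in that prints sample lines. Two differences from the sed filter are worth keeping in mind: `partition` splits at the first "watch?v=" on a line, whereas sed's greedy `.*v=` cuts at the last "v=", and the marker here has no leading slash, as in the sed expression.

```python
import subprocess
import sys

MARKER = "watch?v="  # same motif as in the sed filter
ID_LENGTH = 11


def extract_ids(lines):
    """Return the sorted, unique 11-character strings that follow MARKER."""
    found = set()
    for line in lines:
        before, sep, after = line.rstrip().partition(MARKER)
        # sep is empty when the marker is absent from the line.
        if sep and len(after) >= ID_LENGTH:
            found.add(after[:ID_LENGTH])
    return sorted(found)


def ids_from_command(argv):
    """Run argv and filter its stdout line by line as it is produced."""
    with subprocess.Popen(argv, stdout=subprocess.PIPE, text=True) as proc:
        # proc.stdout is an iterable of text lines; no temporary file needed.
        return extract_ids(proc.stdout)


if __name__ == "__main__":
    # Stand-in child process (an assumption for this sketch): it only prints
    # a few sample lines. Replace the argument list with the real command.
    sample = (
        'print("a /watch?v=abcdefghijk&x=1");'
        'print("nothing here");'
        'print("b /watch?v=abcdefghijk")'
    )
    print(" ".join(ids_from_command([sys.executable, "-c", sample])))
```

Whether this is as fast as the shell pipeline depends on the input size and is best measured on real data; for most inputs the cost of the child process itself is likely to dominate.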
Mathieu Prevot
Posted: Mon Jul 07, 2008 6:53 pm    Post subject: Re: do a sed / awk filter with python tools (at least as fas
Thanks.
Mathieu