How can we get to the end of a quote inside a string

Guest
Posted: Sun Aug 31, 2008 2:29 pm
Hi all,

Suppose I have a string which contains quotes inside quotes - single and double quotes interchangeably -

    s = "a1' b1 " c1' d1 ' c2" b2 'a2"

I need to start at b1 and end at b2 - i.e. I have to parse the single quote strings from inside s.
Is there an existing string quote parser which I can use or should I write a parser myself?
If somebody could help me on this I would be much obliged.
Regards,
kR/\/

Wojtek Walczak
Posted: Sun Aug 31, 2008 5:20 pm
On Sun, 31 Aug 2008 07:29:26 -0700 (PDT), rajmohan.h@gmail.com wrote:
Quote: Suppose I have a string which contains quotes inside quotes - single and double quotes interchangeably - s = "a1' b1 " c1' d1 ' c2" b2 'a2"

    >>> s = "a1' b1 " c1' d1 ' c2" b2 'a2"
      File "<stdin>", line 1
        s = "a1' b1 " c1' d1 ' c2" b2 'a2"
                       ^
    SyntaxError: invalid syntax

Writing a small parser for your needs shouldn't be that hard. To some extent you can use regular expressions:

    >>> re.findall(re.compile("\".*?\""), s)
    ['" c1\' d1 \' c2"']
    >>> re.findall(re.compile("\'.*?\'"), s)
    ['\' b1 " c1\'', '\' c2" b2 \'']

but it won't work in all cases. You can read more here: LINK
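
For what it's worth, here is a minimal sketch of what such a small hand-written parser might look like. It is only an illustration, not code from this thread: the function name `parse_nested_quotes` is made up, and it assumes a simple rule where a quote character closes the current level if it matches the quote that opened it, and otherwise opens a deeper level. The test string is written with its outer double quotes as part of the data.

```python
def parse_nested_quotes(text):
    """Return nested lists mirroring the quote nesting in text.

    Assumed rule (hypothetical, for illustration): a quote character closes
    the current level if it matches the quote that opened that level;
    otherwise it opens a new, deeper level.
    """
    root = []
    stack = [(None, root)]   # (opening quote char, list collecting that level)
    buf = []                 # plain characters seen since the last quote

    def flush():
        # move any buffered plain text into the current level
        if buf:
            stack[-1][1].append("".join(buf))
            del buf[:]

    for ch in text:
        if ch not in "'\"":
            buf.append(ch)
            continue
        flush()
        if stack[-1][0] == ch:
            # same quote char as the one that opened this level: close it
            stack.pop()
        else:
            # the other quote char: start a nested level
            level = []
            stack[-1][1].append(level)
            stack.append((ch, level))
    flush()
    if len(stack) != 1:
        raise ValueError("unbalanced quotes")
    return root


# the example string, including its outer double quotes
s = '''"a1' b1 " c1' d1 ' c2" b2 'a2"'''
result = parse_nested_quotes(s)
print(result[0][1])   # the level between the outermost single quotes
```

Under that rule a quote can never nest directly inside a quote of the same kind, so this only covers strictly alternating input like the example.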
--
Regards, Wojtek Walczak, LINK

Antoon Pardon
Posted: Tue Sep 02, 2008 11:54 am
On 2008-08-31, rajmohan.h@gmail.com <rajmohan.h@gmail.com> wrote:
Quote: Hi all, Suppose I have a string which contains quotes inside quotes - single and double quotes interchangeably - s = "a1' b1 " c1' d1 ' c2" b2 'a2" I need to start at b1 and end at b2 - i.e. I have to parse the single quote strings from inside s.
Is there an existing string quote parser which I can use or should I write a parser myself?
If somebody could help me on this I would be much obliged.

You could use a combination of split and join in this case.

    # use a single quote as a separator to split the string into a list of substrings
    ls = s.split("'")

    # remove what comes before the first and after the last single quote
    ls = ls[1:-1]

    # reassemble the string between the outermost single quotes
    s = "'".join(ls)

    # strip spaces in front and after if you wish
    s = s.strip()

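
The same slice between the outermost single quotes can also be taken by index. The following is just a sketch of that variant, not part of the suggestion above; the helper name `between_outer_single_quotes` is made up for illustration, and the test string is written without its outer double quotes.

```python
def between_outer_single_quotes(text):
    """Return the text between the first and last single quote, or None."""
    first = text.find("'")
    last = text.rfind("'")
    if first == -1 or first == last:
        # fewer than two single quotes: nothing to extract
        return None
    return text[first + 1:last]


s = '''a1' b1 " c1' d1 ' c2" b2 'a2'''

by_index = between_outer_single_quotes(s)
by_split = "'".join(s.split("'")[1:-1])   # the split/join steps in one line
print(by_index == by_split)
print(by_index.strip())
```

One difference worth noting: with fewer than two single quotes the split/join steps yield an empty string, while this sketch returns None so the caller can tell the cases apart.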
--
Antoon Pardon

Paul McGuire
Posted: Tue Sep 02, 2008 2:51 pm
On Aug 31, 9:29 am, rajmoha...@gmail.com wrote:
Quote: Hi all, Suppose I have a string which contains quotes inside quotes - single and double quotes interchangeably - s = "a1' b1 " c1' d1 ' c2" b2 'a2" I need to start at b1 and end at b2 - i.e. I have to parse the single quote strings from inside s.

Pyparsing defines a helper method called nestedExpr - typically it is used to find nesting of ()'s, or []'s, etc., but I was interested to see if I could use nestedExpr to match nested ()'s, []'s, AND {}'s all in the same string (like we used to do in our algebra class to show nesting of higher levels than parens - something like "{[a + 3*(b-c)] + 7}" - that is, ()'s nest within []'s, and []'s nest within {}'s). This IS possible, but it uses some advanced pyparsing methods. I adapted this example to map to your case - this was much simpler, as ""s nest within ''s, and ''s nest within ""s. I still keep a stack of previous nesting, but I'm not sure this was absolutely necessary. Here is the working code with your example:

    from pyparsing import Forward, oneOf, NoMatch, Literal, CharsNotIn, nestedExpr

    # define special subclass of Forward, that saves previous contained
    # expressions in a stack
    class ForwardStack(Forward):
        def __init__(self):
            super(ForwardStack,self).__init__()
            self.exprStack = []
            self << NoMatch()
        def __lshift__(self,expr):
            self.exprStack.append(self.expr)
            super(ForwardStack,self).__lshift__(expr)
            return self
        def pop(self):
            self.expr = self.exprStack.pop()

    # define the grammar
    opening = ForwardStack()
    closing = ForwardStack()
    opening << oneOf(["'", '"'])
    closing << NoMatch()
    matchedNesting = nestedExpr(opening, closing, CharsNotIn('\'"'), ignoreExpr=None)

    # define parse-time callbacks
    alternate = {'"':"'", "'":'"'}
    def pushAlternate(t):
        # closing expression should match the current opening quote char
        closing << Literal( t[0] )
        # if we find the other opening quote char, it is the beginning of
        # a nested quote
        opening << Literal( alternate[ t[0] ] )
    def popClosing():
        closing.pop()
        opening.pop()
    # when these expressions match, the parse action will be called
    opening.setParseAction(pushAlternate)
    closing.setParseAction(popClosing)

    # parse the test string
    s = """ "a1' b1 " c1' d1 ' c2" b2 'a2" """

    print(matchedNesting.parseString(s)[0])

Prints:
['a1', [' b1 ', [' c1', [' d1 '], ' c2'], ' b2 '], 'a2']
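
To get from that nested list back to the "b1 to b2" text the original question asks for, something like the following sketch could be used. It is an illustration only: it works on a plain Python list copied from the printed output above (not on a pyparsing result object), the helper name `rebuild` is made up, and it relies on the assumption that quote characters alternate strictly from one nesting level to the next.

```python
OTHER_QUOTE = {'"': "'", "'": '"'}

def rebuild(nested, quote):
    """Rejoin a nested list into text.

    `quote` is the character used to wrap the sublists found directly in
    `nested`; deeper levels use the other quote character, alternating.
    """
    parts = []
    for item in nested:
        if isinstance(item, list):
            parts.append(quote + rebuild(item, OTHER_QUOTE[quote]) + quote)
        else:
            parts.append(item)
    return "".join(parts)


# plain-list copy of the printed result
nested = ['a1', [' b1 ', [' c1', [' d1 '], ' c2'], ' b2 '], 'a2']

# nested[1] is the level between the outermost single quotes;
# the sublists directly inside it were delimited by double quotes
inner_text = rebuild(nested[1], '"')
print(inner_text.strip())
```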
--
Paul