Question: http://oj.leetcode.com/problems/regular-expression-matching/
Question Name: Regular Expression Matching
Year ago, at the first time I read the book “Beautiful Code”, I was completely shocked by the amazingly beautiful regular expression matcher. It is elegant, simple, and really beautiful. As that book saying:
Beautiful code is likely to be simple — clear and easy to understand. Beautitful code is likely to be compact — just enough code to do the job and no more — but not cryptic, to the point where it cannot be understood. Beautiful code may well be general, solving a broad class of problems in a uniform way. One might even describe it as elegant, showing good taste and refinement.
by Brian Kernighan
Every software engineer should read it.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 | class Solution: # @return a pre-processed pattern string # Make the pattern compact. For example, we will shorten "a*.*" and "a*a*" into # ".*" and "a*" respectively. def _compile(self, p): result = "" while len(p) != 0: if len(p) >= 2 and p[1] == "*": # Meet with a new *, without * neighbors if len(result) < 2 or result[-1] != "*": result += p[0:2] # Meet with a same * segment as previous one elif result[-2] == p[0] or result[-2] == ".": pass # Reduce all * segments around .* elif p[0] == ".": while len(result) >= 2 and result[-1] == "*": result = result[:-2] result += p[:2] # Meet with a different * segment else: result += p[0:2] p = p[2:] else: result += p[0] p = p[1:] return result # @return True if s matches p completely # False else. def _isMatch(self, s, p): # Both string and pattern reach the end. Nothing left to deal with. if len(s) == 0 and len(p) == 0: return True # Pattern ends. But string left some substring unmatched elif len(p) == 0: return False # Meet with a * segment in pattern elif len(p) >= 2 and p[1] == "*": return self._isMatchStar(s, p[2:], p[0]) # String ends. But pattern remains some part. And the remaining part is not # a * segment. In other words, the remaining part DO need some subtring to # match. But we do not have any more. elif len(s) == 0: return False # Single character matches. Recursively check the next position. elif s[0] == p[0] or p[0] == ".": return self._isMatch(s[1:], p[1:]) # Single character fails to match. else: return False # @return True if s matches (ch)*p completely # False else. def _isMatchStar(self, s, p, ch): # Works like a DFS algorithm while True: # Simply try to skip the * segment. That is,the * segment does not consume # any thing in this round. if self._isMatch(s, p): return True # The * segment would like to, but has noting to consume. elif len(s) == 0: return False # This * segment consume one quanlified character in the string elif s[0] == ch or ch == ".": s = s[1:] # This * segment would like to, but has no quanlified content to consume. else: return False def isMatch(self, s, p): assert p == "" or p[0] != "*" return self._isMatch(s, self._compile(p)) |