logo资料库

python正则表达式速查表.pdf

第1页 / 共1页
资料共1页,全文预览结束
LEARN DATA SCIENCE ONLINE Start Learning For Free - www.dataquest.io Data Science Cheat Sheet Python Regular Expressions SPECIAL CHARACTERS ^ | Matches the expression to its right at the start of a string. It matches every such instance before each \n in the string. $ | Matches the expression to its left at the end of a string. It matches every such instance before each \n in the string. . | Matches any character except line terminators like \n. \ | Escapes special characters or denotes character classes. A|B | Matches expression A or B. If A is matched first, B is left untried. + | Greedily matches the expression to its left 1 or more times. \A | Matches the expression to its right at the absolute start of a string whether in single or multi-line mode. \Z | Matches the expression to its left at the absolute end of a string whether in single or multi-line mode. SETS [ ] | Contains a set of characters to match. [amk] | Matches either a, m, or k. It does not match amk. [a-z] | Matches any alphabet from a to z. [a\-z] | Matches a, -, or z. It matches - because \ escapes it. * | Greedily matches the expression to its left [a-] | Matches a or -, because - is not being 0 or more times. ? | Greedily matches the expression to its left 0 or 1 times. But if ? is added to qualifiers (+, *, and ? itself) it will perform matches in a non-greedy manner. {m} | Matches the expression to its left m times, and not less. {m,n} | Matches the expression to its left m to n times, and not less. {m,n}? | Matches the expression to its left m times, and ignores n. See ? above. CHARACTER CLASSES (A.K.A. SPECIAL SEQUENCES) \w | Matches alphanumeric characters, which means a-z, A-Z, and 0-9. It also matches the underscore, _. \d | Matches digits, which means 0-9. \D | Matches any non-digits. \s | Matches whitespace characters, which include the \t, \n, \r, and space characters. \S | Matches non-whitespace characters. \b | Matches the boundary (or empty string) at the start and end of a word, that is, between \w and \W. \B | Matches where \b does not, that is, the boundary of \w characters. used to indicate a series of characters. [-a] | As above, matches a or -. [a-z0-9] | Matches characters from a to z and also from 0 to 9. [(+*)] | Special characters become literal inside a set, so this matches (, +, *, and ). [^ab5] | Adding ^ excludes any character in the set. Here, it matches characters that are not a, b, or 5. GROUPS ( ) | Matches the expression inside the parentheses and groups it. (?) | Inside parentheses like this, ? acts as an extension notation. Its meaning depends on the character immediately to its right. (?PAB) | Matches the expression AB, and it can be accessed with the group name. (?aiLmsux) | Here, a, i, L, m, s, u, and x are flags: a — Matches ASCII only i — Ignore case L — Locale dependent m — Multi-line s — Matches all u — Matches unicode x — Verbose (?:A) | Matches the expression as represented by A, but unlike (?PAB), it cannot be retrieved afterwards. (?#...) | A comment. Contents are for us to read, not for matching. A(?=B) | Lookahead assertion. This matches the expression A only if it is followed by B. A(?!B) | Negative lookahead assertion. This matches the expression A only if it is not followed by B. (?<=B)A | Positive lookbehind assertion. This matches the expression A only if B is immediately to its left. This can only matched fixed length expressions. (?
分享到:
收藏