Python REGEX

^      Matches the beginning of a line

$      Matches the end of the line 

.      Matches any character except a newline

\s      Matches a whitespace character

\S      Matches any non-whitespace character

*      Repeats the preceding character zero or more times

*?      Repeats the preceding character zero or more times (non-greedy)

+      Repeats the preceding character one or more times

+?      Repeats the preceding character one or more times (non-greedy)

[aeiou]       Matches a single character in the listed set 

[^XYZ]      Matches a single character not in the listed set 

[a-z0-9]      The set of characters can include a range

(      Indicates where string extraction is to start

)       Indicates where string extraction is to end
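
The difference between the greedy and non-greedy forms is easiest to see side by side. The sketch below is only an illustration: the sample string and the variable names greedy and lazy are assumptions made for this example, not part of the table above.

```python
import re

# Sample text chosen for illustration; it contains two colons.
text = 'From: Using the : character'

# Greedy: .+ grabs as much as it can, so the match runs to the last colon.
greedy = re.findall(r'^F.+:', text)

# Non-greedy: .+? grabs as little as it can, so the match stops at the first colon.
lazy = re.findall(r'^F.+?:', text)

print(greedy)  # ['From: Using the :']
print(lazy)    # ['From:']
```

Both patterns start with ^F, so they only match when the line begins with the letter F.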

Find and extract all the numbers (runs of one or more digits) from the string.

import re
x = 'My 2 favorite numbers are 19 and 42'
y = re.findall('[0-9]+', x)
print(y)

Result:

['2', '19', '42']
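
Note that findall returns strings, not numbers. If the values are needed for arithmetic, one possible approach is to convert each match with int(), as in this small sketch (the variable name numbers is just an illustrative choice):

```python
import re

x = 'My 2 favorite numbers are 19 and 42'

# findall returns a list of strings, so convert each match to an integer.
numbers = [int(s) for s in re.findall(r'[0-9]+', x)]

print(numbers)       # [2, 19, 42]
print(sum(numbers))  # 63
```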

Extract the email address from the string.

import re
lin = 'From: stephen.marquard@uct.ac.za Sat Jan  5 09:14:16 2008'
y = re.findall(r'\S+@\S+', lin)  
print(y)

Result:

['stephen.marquard@uct.ac.za']
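
The pattern \S+@\S+ accepts any non-whitespace characters around the @ sign, so punctuation attached to the address is captured too. The sketch below shows one possible stricter pattern that requires the match to start with a letter or digit and end with a letter. The sample line with angle brackets is hypothetical, made up to show the difference.

```python
import re

# Hypothetical line where the address is wrapped in angle brackets.
lin = 'From: <stephen.marquard@uct.ac.za> Sat Jan  5 09:14:16 2008'

# The loose pattern keeps the surrounding < and >.
loose = re.findall(r'\S+@\S+', lin)

# The stricter pattern must begin with a letter or digit and end with a letter.
strict = re.findall(r'[a-zA-Z0-9]\S*@\S*[a-zA-Z]', lin)

print(loose)   # ['<stephen.marquard@uct.ac.za>']
print(strict)  # ['stephen.marquard@uct.ac.za']
```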

Extract the domain from the email address. Because the pattern contains parentheses, findall returns only the text matched inside them.

import re
lin = 'From: stephen.marquard@uct.ac.za Sat Jan  5 09:14:16 2008'
y = re.findall('@([^ ]*)', lin)  
print(y)

Result:

['uct.ac.za']
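
The ^ anchor can be combined with parentheses to pull the domain only from lines that begin with "From:". This is a minimal sketch; the second line in the list is a hypothetical example of a line that should be skipped.

```python
import re

lines = [
    'From: stephen.marquard@uct.ac.za Sat Jan  5 09:14:16 2008',
    'Subject: contact me @home',  # hypothetical line; does not start with "From:"
]

domains = []
for line in lines:
    # ^From: restricts matches to lines starting with "From:";
    # findall returns only the part captured by the parentheses.
    domains.extend(re.findall(r'^From: .*@([^ ]*)', line))

print(domains)  # ['uct.ac.za']
```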

To learn more, see the Python documentation on regular expressions.

From the book Python for Everybody.
