^ Matches the beginning of a line
$ Matches the end of the line
. Matches any character (except a newline, by default)
\s Matches whitespace
\S Matches any non-whitespace character
* Repeats the preceding character zero or more times
*? Repeats the preceding character zero or more times (non-greedy)
+ Repeats the preceding character one or more times
+? Repeats the preceding character one or more times (non-greedy)
[aeiou] Matches a single character in the listed set
[^XYZ] Matches a single character not in the listed set
[a-z0-9] The set of characters can include a range
( Indicates where string extraction is to start
) Indicates where string extraction is to end
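The difference between the greedy and non-greedy forms is easiest to see on a string with more than one possible stopping point. The following is a minimal sketch; the sample string and the variable names (text, greedy, lazy, vowels) are made up for illustration and are not part of the original notes.

```python
import re

# Sample string with two colons, chosen for illustration only.
text = 'From: Using the : character'

# Greedy: .+ expands as far as it can, so the match ends at the LAST colon.
greedy = re.findall('^F.+:', text)

# Non-greedy: .+? stops as soon as the rest of the pattern can match,
# so the match ends at the FIRST colon.
lazy = re.findall('^F.+?:', text)

# A character set matches exactly one character from the listed set.
vowels = re.findall('[aeiou]', 'regex')

print(greedy)  # ['From: Using the :']
print(lazy)    # ['From:']
print(vowels)  # ['e', 'e']
```

Both patterns start at the beginning of the line because of ^; only the amount of text consumed by the repetition differs.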
Find and extract every number (run of digits) in the string.
import re
x = 'My 2 favorite numbers are 19 and 42'
y = re.findall('[0-9]+', x)
print(y)
Result:
['2', '19', '42']
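Note that re.findall() returns strings, not numbers. If the values are needed for arithmetic, they have to be converted first. A small sketch of one way to do this, reusing the same sample string (the variable name numbers is an illustrative choice):

```python
import re

x = 'My 2 favorite numbers are 19 and 42'

# findall() gives a list of strings; int() converts each one.
numbers = [int(s) for s in re.findall('[0-9]+', x)]

print(numbers)       # [2, 19, 42]
print(sum(numbers))  # 63
```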
Extract the email address from the string.
import re
lin = 'From: stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008'
y = re.findall(r'\S+@\S+', lin)
print(y)
Result:
['stephen.marquard@uct.ac.za']
Extract the domain from the email address.
import re
lin = 'From: stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008'
y = re.findall('@([^ ]*)', lin)
print(y)
Result:
['uct.ac.za']
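The pattern can be made stricter by combining ^ with the parentheses, so that the domain is extracted only from lines that begin with "From: ". This is a sketch under assumptions: the second line in the list is a hypothetical example added to show a non-matching case, and the names lines, pattern, and results are illustrative.

```python
import re

lines = [
    'From: stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008',
    'Reply-To: someone@example.com',  # hypothetical line, not from the source
]

# ^From:  -> the line must start with "From: "
# \S+@    -> non-whitespace characters up to the at-sign
# (\S+)   -> only this part (the domain) is returned by findall()
pattern = r'^From: \S+@(\S+)'

results = [re.findall(pattern, line) for line in lines]

for r in results:
    print(r)
# ['uct.ac.za']
# []
```

Because of the ^From: prefix, the second line yields an empty list even though it contains an at-sign.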
To learn more, see the regular expressions (re module) section of the Python documentation.
From the book Python for Everybody.