Understanding Regex (Regular Expressions)

In this tutorial, I will help you understand regular expression syntax and how to write a regular expression.

Notes:

  • Regex can work with both ASCII and Unicode characters, but in this article we will focus on ASCII only.

1. The Dot (.) – Any Character

The . (dot) character matches any single character: a letter, a digit, whitespace, or a symbol. In most regex engines it does not match a newline by default.

Example Regex: ..C1
ABC1 (match)
DEF  (Not match)
XYC2 (Not match)

Because the dot has this special meaning, a plain . does not match only a period. To match a period specifically, we have to escape the dot with a backslash: \.

Example Regex: ..\.1
AB.1  (match)
DE.2  (Not match)
XY3   (Not match)
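
To check the two dot examples above, here is a minimal sketch using Python's built-in re module. The variable names are my own, and fullmatch is used on the assumption that each sample string should match the pattern as a whole.

```python
import re

# Unescaped dots: any two characters followed by the literal text "C1".
any_two_then_c1 = re.compile(r"..C1")
# Escaped dot: any two characters, then a literal period, then "1".
any_two_then_period_1 = re.compile(r"..\.1")

dot_results = {s: bool(any_two_then_c1.fullmatch(s)) for s in ["ABC1", "DEF", "XYC2"]}
escaped_results = {s: bool(any_two_then_period_1.fullmatch(s)) for s in ["AB.1", "DE.2", "XY3"]}

print(dot_results)      # {'ABC1': True, 'DEF': False, 'XYC2': False}
print(escaped_results)  # {'AB.1': True, 'DE.2': False, 'XY3': False}
```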

2. \d – Any digit from 0 to 9

Example: ab\d\d
ab12 (match)
ab99 (match)
abcd (Not match)

P.S.: \D matches any non-digit character.
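
A small sketch of \d and \D in Python. The re.ASCII flag is my addition so that \d stays limited to 0-9, in line with the ASCII-only scope of this article. The sample string "a1b2" is made up for illustration.

```python
import re

# re.ASCII restricts \d to the digits 0-9 and \D to everything else.
two_digits = re.compile(r"ab\d\d", re.ASCII)
non_digit = re.compile(r"\D", re.ASCII)

digit_results = {s: bool(two_digits.fullmatch(s)) for s in ["ab12", "ab99", "abcd"]}
non_digits_found = non_digit.findall("a1b2")

print(digit_results)     # {'ab12': True, 'ab99': True, 'abcd': False}
print(non_digits_found)  # ['a', 'b']
```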

3. [abc] – square brackets – match a single character that is a, b, or c

Example Regex: [cmf]an
can (match)
man (match)
fan (match)
yan (Not match)

4. [^abc] – hat – match any single character except a, b, or c

Example Regex: [^cmf]an
can (Not match)
man (Not match)
fan (Not match)
yan (match)
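
The bracket and hat examples can be tried side by side. This is only a sketch that filters the same four words with each pattern.

```python
import re

words = ["can", "man", "fan", "yan"]

# [cmf] accepts exactly one of c, m, or f in the first position.
in_set = [w for w in words if re.fullmatch(r"[cmf]an", w)]
# [^cmf] accepts any single character other than c, m, or f.
not_in_set = [w for w in words if re.fullmatch(r"[^cmf]an", w)]

print(in_set)      # ['can', 'man', 'fan']
print(not_in_set)  # ['yan']
```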

5. [a-c] – range – match a single character from a sequential range without listing every character

Examples:
[0-6] matches a single digit from zero to six
[^0-6] matches any single character except the digits zero to six
[a-z] matches a single lowercase letter from a to z
[A-Z] matches a single uppercase letter from A to Z
[A-Za-z0-9_] matches a single letter, digit, or underscore (the same set as \w for ASCII text)
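
To see which characters each range picks out, the sketch below runs every range over one sample string. The string "a1B7_z-!" is made up for illustration.

```python
import re

sample = "a1B7_z-!"

zero_to_six = re.findall(r"[0-6]", sample)
not_zero_to_six = re.findall(r"[^0-6]", sample)
lowercase = re.findall(r"[a-z]", sample)
uppercase = re.findall(r"[A-Z]", sample)
word_chars = re.findall(r"[A-Za-z0-9_]", sample)

print(zero_to_six)      # ['1']
print(not_zero_to_six)  # ['a', 'B', '7', '_', 'z', '-', '!']
print(lowercase)        # ['a', 'z']
print(uppercase)        # ['B']
print(word_chars)       # ['a', '1', 'B', '7', '_', 'z']
```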

6. a{3} (exactly 3 times) or a{1,3} (1 to 3 times) – match a number of repetitions of a character

Example Regex: Wakeu{1,3}p
Wakeup     (match)
Wakeuup    (match)
Wakeuuup   (match)
Wakeuuuup  (Not match)
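
Here is a short sketch of both repetition forms in Python. The a{3} test strings are my own additions.

```python
import re

# u{1,3}: between one and three "u" characters.
between_one_and_three = re.compile(r"Wakeu{1,3}p")
words = ["Wakeup", "Wakeuup", "Wakeuuup", "Wakeuuuup"]
range_results = {w: bool(between_one_and_three.fullmatch(w)) for w in words}

# a{3}: exactly three "a" characters.
exactly_three = re.compile(r"a{3}")
exact_results = {s: bool(exactly_three.fullmatch(s)) for s in ["aa", "aaa", "aaaa"]}

print(range_results)  # {'Wakeup': True, 'Wakeuup': True, 'Wakeuuup': True, 'Wakeuuuup': False}
print(exact_results)  # {'aa': False, 'aaa': True, 'aaaa': False}
```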

7. Kleene Star (*) – 0 or more repetitions, and Kleene Plus (+) – 1 or more repetitions

Example Regex: a+b*c+
aaaabbcc (match)
aacccc   (match)
addd     (Not match)         
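
The same example can be checked in Python. I added the string "bbcc" to the list to show that + really does require at least one a.

```python
import re

# a+ : one or more "a";  b* : zero or more "b";  c+ : one or more "c".
pattern = re.compile(r"a+b*c+")

samples = ["aaaabbcc", "aacccc", "addd", "bbcc"]
star_plus_results = {s: bool(pattern.fullmatch(s)) for s in samples}

print(star_plus_results)
# {'aaaabbcc': True, 'aacccc': True, 'addd': False, 'bbcc': False}
```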

8. Question mark (?) – optional character

The pattern ab?c matches either of the strings "abc" or "ac" because the b is optional.

Like the dot metacharacter, the question mark is a special character, so we have to escape it with a backslash (\?) to match a plain question mark in a string.
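
A quick sketch of the optional b and of the escaped question mark. The strings "abbc", "why?" and "why" are made up for illustration.

```python
import re

# b? means the "b" may appear zero or one time.
optional_results = {s: bool(re.fullmatch(r"ab?c", s)) for s in ["abc", "ac", "abbc"]}

# \? matches a literal question mark instead of acting as a quantifier.
literal_results = {s: bool(re.fullmatch(r"why\?", s)) for s in ["why?", "why"]}

print(optional_results)  # {'abc': True, 'ac': True, 'abbc': False}
print(literal_results)   # {'why?': True, 'why': False}
```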

9. Whitespace

( ) Space
(\t) Tab
(\n) New Line
(\r) Carriage return (used together with \n in Windows line endings)
--> \s matches any of the whitespace characters above
P.S.: \S matches any non-whitespace character.
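
The sketch below uses \s and \S on one made-up string that contains a space, a tab, a newline and a Windows-style \r\n line ending.

```python
import re

text = "one two\tthree\nfour\r\nfive"

# \s matches each whitespace character individually.
whitespace_chars = re.findall(r"\s", text)
# \s+ treats a run of whitespace (such as \r\n) as one separator.
words_by_split = re.split(r"\s+", text)
# \S+ grabs each run of non-whitespace characters.
words_by_find = re.findall(r"\S+", text)

print(whitespace_chars)  # [' ', '\t', '\n', '\r', '\n']
print(words_by_split)    # ['one', 'two', 'three', 'four', 'five']
print(words_by_find)     # ['one', 'two', 'three', 'four', 'five']
```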

10. ^ (hat) for the start and $ (dollar sign) for the end

Example: ^Mission.
Mission1 (match)
Missionb (match)
SecondMission (Not match)

Example: 123$
abc123 (match)
xyz123 (match)
abc123xyz (Not match)
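
Anchors matter most when the pattern is searched for anywhere in a string. So this sketch uses re.search, which would otherwise happily find "Mission" in the middle of "SecondMission".

```python
import re

# ^ pins the match to the start of the string.
start_results = {
    s: bool(re.search(r"^Mission.", s))
    for s in ["Mission1", "Missionb", "SecondMission"]
}

# $ pins the match to the end of the string.
end_results = {
    s: bool(re.search(r"123$", s))
    for s in ["abc123", "xyz123", "abc123xyz"]
}

print(start_results)  # {'Mission1': True, 'Missionb': True, 'SecondMission': False}
print(end_results)    # {'abc123': True, 'xyz123': True, 'abc123xyz': False}
```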

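11. (…) Match groups

Parentheses capture the part of the match they enclose so that it can be extracted afterwards. In the pattern below, .+ is any run of one or more characters, and the final \. is escaped so it matches the literal period before pdf. The group therefore captures the file name without its extension.

Example Regex: ^(file.+)\.pdf$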
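
Here is a sketch of how a group can be read back in Python. It assumes the goal is to capture the file name without the .pdf extension, which needs an unescaped dot (.+) inside the group. The file names and the capture_name helper are hypothetical.

```python
import re

# Group 1 captures "file" plus at least one more character, up to the final ".pdf".
file_pattern = re.compile(r"^(file.+)\.pdf$")

def capture_name(filename):
    """Return the captured name without extension, or None if there is no match."""
    m = file_pattern.search(filename)
    return m.group(1) if m else None

names = ["file_report.pdf", "file_notes.pdf.tmp", "testfile.pdf"]
captured = {n: capture_name(n) for n in names}

print(captured)
# {'file_report.pdf': 'file_report', 'file_notes.pdf.tmp': None, 'testfile.pdf': None}
```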

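12. (a(bc)) Nested groups

Groups can be placed inside other groups. They are numbered by the position of their opening parenthesis, so in the pattern below group 1 is the whole match and group 2 is the four digits.

Example Regex: (.*(\d{4}))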
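
A minimal sketch of how nested groups are numbered. The input "Jan 1987" is a made-up sample.

```python
import re

# Outer group (1) wraps everything; inner group (2) wraps the four digits.
match = re.search(r"(.*(\d{4}))", "Jan 1987")

outer = match.group(1)
inner = match.group(2)

print(outer)  # Jan 1987
print(inner)  # 1987
```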

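13. Match all

The dot combined with the star matches zero or more of any character, so this group captures everything on a line, including an empty line.

Example Regex: (.*)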
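
The sketch below shows what (.*) captures in Python with default flags. Note that it stops at a newline, because the dot does not match newlines unless the re.DOTALL flag is set. The sample strings are made up.

```python
import re

whole_line = re.match(r"(.*)", "anything at all").group(1)
empty_line = re.match(r"(.*)", "").group(1)
first_line_only = re.match(r"(.*)", "line one\nline two").group(1)
# With re.DOTALL the dot also matches newlines.
both_lines = re.match(r"(.*)", "line one\nline two", re.DOTALL).group(1)

print(repr(whole_line))       # 'anything at all'
print(repr(empty_line))       # ''
print(repr(first_line_only))  # 'line one'
print(repr(both_lines))       # 'line one\nline two'
```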

14. (abc|def) matches abc or def
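
A short sketch of alternation in Python. The strings "abd" and "abc-def-ghi" are made up for illustration.

```python
import re

either = re.compile(r"(abc|def)")

alt_results = {s: bool(either.fullmatch(s)) for s in ["abc", "def", "abd"]}
# With a single group, findall returns what the group captured for each match.
found = either.findall("abc-def-ghi")

print(alt_results)  # {'abc': True, 'def': True, 'abd': False}
print(found)        # ['abc', 'def']
```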

15. Summary

