Mastering JavaScript Fundamentals: Regular Expressions
Get the fundamentals down and the level of everything you do will rise. - Michael Jordan
As stated in my original post, I do 1 hour of video lessons from Watch and Code every day. If you're interested in learning JavaScript in a way that goes beyond basic tutorials and gives you foundational, practical knowledge without relying on frameworks, I'd highly recommend it. If you're reading these posts, please keep in mind that these are just my notes, and I'm not an expert (yet!). If your goal is also to master the fundamentals of JavaScript, please head over to Watch and Code and start your journey there!
All screenshots were annotated using Shotty.
Regular Expressions
A concise way to look for patterns in strings.
The core syntax is pretty much the same in most programming languages, though the details vary a little between them.
The goal of this exercise is to get comfortable enough to figure out regular expressions when they come up.
How do we figure out which passwords are strong and which ones are weak?
Grading passwords with regular expressions:
- Number of total characters
- Number of lowercase letters
- Number of uppercase letters
- Number of digits
- Number of special characters
- Number of "words" (consecutive letters)
Some basic regex examples to get started:
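The original examples here were screenshots, so this is a minimal sketch rebuilt from the summary at the end of these notes. The variable names are my own and only for illustration.

```javascript
// Without flags, match() stops at the first match.
const firstS = 'strings'.match(/s/);
console.log(firstS[0]); // "s"

// The global flag, g, finds every match.
const allS = 'strings'.match(/s/g);
console.log(allS); // ["s", "s"]

// The pipe means "or": match a or b.
const aOrB = 'abc'.match(/a|b/g);
console.log(aOrB); // ["a", "b"]

// A set (square brackets) matches any one of the characters listed inside it.
const listedChars = 'abc'.match(/[abc]/g);
console.log(listedChars); // ["a", "b", "c"]
```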
The last expression is not a sustainable way to detect all characters, since you'd have to do a lot of typing, and account for both uppercase and lowercase, other characters, spaces, etc.
Luckily there's a very concise way to do this in regex:
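A minimal sketch of the dot in action, assuming a made-up sample password (the string and variable names are mine, not from the lessons). Note that the dot matches any character except a line break.

```javascript
// Hypothetical sample password, used only for illustration.
const dotSample = 'hello World 42!';

// The dot matches any single character; the g flag collects all of them.
const everyChar = dotSample.match(/./g);
console.log(everyChar.length); // 15
```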
This covers the first requirement on our list: the number of total characters.
These shortcuts, such as using . to represent any character (except a line break), are called "metacharacters".
Only use regular expressions when it's a lot easier than the alternative.
Compare these:
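The comparison is roughly this, sketched with a made-up sample string (an assumption on my part). Both lines give the same number, but the second is obviously simpler.

```javascript
// Hypothetical sample password, used only for illustration.
const compareSample = 'hello World 42!';

// With a regular expression: match every character, then count the matches.
const lengthViaRegex = compareSample.match(/./g).length;

// Without one: just read the length property.
const lengthViaProperty = compareSample.length;

console.log(lengthViaRegex, lengthViaProperty); // 15 15
```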
Not the best use of regular expressions to get the length of this string, but a good introduction to how they work.
Let's look at the next one of our requirements:
Number of lowercase letters
We could type out every lowercase letter inside a set, but we can do better. We can use something called ranges. Ranges only work in sets (the square brackets, []). We express a range with a dash, for example a-z.
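Here is a small sketch of the two approaches side by side, using a sample string I made up for illustration. The range gives the same result as typing out the whole alphabet.

```javascript
// Hypothetical sample password, used only for illustration.
const lowerSample = 'hello World 42!';

// The long way: list every lowercase letter in the set.
const lowerListed = lowerSample.match(/[abcdefghijklmnopqrstuvwxyz]/g);

// The short way: a range from a to z.
const lowerRange = lowerSample.match(/[a-z]/g);

console.log(lowerListed.length); // 9
console.log(lowerRange.length);  // 9
```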
So that satisfies requirement #2 of counting the number of lowercase letters.
Without regular expressions this would be difficult.
Let's look at our next requirement:
Number of uppercase characters
We can use a range in the same way we did for lowercase letters, but we just need to change them to uppercase. Easy.
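A quick sketch with the same kind of made-up sample string (my assumption, not from the lessons):

```javascript
// Hypothetical sample password, used only for illustration.
const upperSample = 'hello World 42!';

// Same idea as a-z, but with an uppercase range.
const uppercaseLetters = upperSample.match(/[A-Z]/g);
console.log(uppercaseLetters); // ["W"]
```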
Number of digits
Again, we can use a range here to do this easily:
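For digits the range is 0-9. Again, the sample string is just something I made up:

```javascript
// Hypothetical sample password, used only for illustration.
const digitSample = 'hello World 42!';

// A range works for digits too. Each digit is matched as a one-character string.
const digits = digitSample.match(/[0-9]/g);
console.log(digits); // ["4", "2"]
```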
Number of special characters
(Anything that is not a letter or a digit).
Remember that we can combine ranges in one set to get anything that is a letter or a digit: [a-zA-Z0-9].
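As a sketch, three ranges combined in a single set look like this (sample string assumed for illustration):

```javascript
// Hypothetical sample password, used only for illustration.
const alnumSample = 'hello World 42!';

// Three ranges in one set: lowercase letters, uppercase letters, digits.
const lettersAndDigits = alnumSample.match(/[a-zA-Z0-9]/g);
console.log(lettersAndDigits.length); // 12
```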
If we just had a way to say that we want anything that is not this expression, we'd be in good shape. Well lucky for us, that exists: the caret (^).
The way to interpret the caret is: when it is the first character inside a set, match any character that is not in that set.
Let's try it out:
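A minimal sketch, assuming a made-up sample password. Notice that spaces count as "special" here, since they are neither letters nor digits.

```javascript
// Hypothetical sample password, used only for illustration.
const specialSample = 'hello World 42!';

// The caret at the start of the set negates it:
// match anything that is NOT a letter or a digit.
const specialChars = specialSample.match(/[^a-zA-Z0-9]/g);
console.log(specialChars); // [" ", " ", "!"]
```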
So the new concept from this requirement is the negation operator (^) that we can use in sets. The way to use it is:
- Create a set with []
- Define the characters you don't want
- Put the negation operator ^ in the front within the []
Number of "words" (consecutive letters)
Let's start by using what we know so far. We know we want to match letters, so that's a good starting point. We'll use a set with 2 ranges: lowercase and uppercase letters:
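Sketched with a made-up sample string (my assumption), the starting point gives us every letter as its own match:

```javascript
// Hypothetical sample password, used only for illustration.
const letterSample = 'hello World 42!';

// One set, two ranges: any lowercase or uppercase letter.
const singleLetters = letterSample.match(/[a-zA-Z]/g);
console.log(singleLetters.length); // 10
console.log(singleLetters[0]);     // "h"
```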
The problem with this regular expression is that it only matches single characters.
We want to match consecutive characters.
Luckily regular expressions have a concise way to look for consecutive characters. The feature is called quantifiers.
The way it works is: a quantifier goes right after the thing we want to repeat and says how many times in a row it may occur, written as a minimum and a maximum in curly braces, {min,max}. If you have 3 letters in a row, it will match those 3 letters as one match instead of three.
Let's say we're looking for letters that happen consecutively, 1 through 20 times. This should detect all the sets of consecutive letters ("words") in our string:
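A small sketch of that quantifier, with a sample string I made up. One thing I'm fairly sure about: there must be no space after the comma inside the braces, otherwise JavaScript does not treat it as a quantifier.

```javascript
// Hypothetical sample password, used only for illustration.
const wordSample = 'hello World 42!';

// {1,20} means: between 1 and 20 of the preceding set, in a row.
const wordsUpTo20 = wordSample.match(/[a-zA-Z]{1,20}/g);
console.log(wordsUpTo20); // ["hello", "World"]
```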
This is nice, but it's a bit of a hack. What if there is a word longer than 20 characters? We need a way to tell it to go to infinity. To do this, we simply leave out the upper limit but keep the comma: {1,}.
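To see the difference, here is a sketch with a made-up 26-letter "word" (the whole alphabet). The capped version splits it in two, and the open-ended version keeps it whole.

```javascript
// Hypothetical input: a 26-letter "word" followed by a short one.
const longWordSample = 'abcdefghijklmnopqrstuvwxyz hi';

// With an upper limit of 20, the long word is split into two matches.
const cappedWords = longWordSample.match(/[a-zA-Z]{1,20}/g);
console.log(cappedWords); // ["abcdefghijklmnopqrst", "uvwxyz", "hi"]

// With no upper limit, it stays in one piece.
const unboundedWords = longWordSample.match(/[a-zA-Z]{1,}/g);
console.log(unboundedWords); // ["abcdefghijklmnopqrstuvwxyz", "hi"]
```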
Let's look at a few more things:
But there are a couple of twists that are common:
Wanting to match one or more consecutive matches is so common, that there is a shortcut: the plus sign, +.
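A sketch showing that the two forms give the same result (the sample string is my own assumption):

```javascript
// Hypothetical sample password, used only for illustration.
const plusSample = 'hello World 42!';

// First form: quantifier with no upper limit.
const wordsWithBraces = plusSample.match(/[a-zA-Z]{1,}/g);

// Second form: the + shortcut, which also means "one or more".
const wordsWithPlus = plusSample.match(/[a-zA-Z]+/g);

console.log(wordsWithBraces); // ["hello", "World"]
console.log(wordsWithPlus);   // ["hello", "World"]
```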
Of the two forms, {1,} and +, the + shortcut is the more common one.
The last thing to note is that it's so common to want to find letters, it can be convenient to not need to specify two ranges.
We can just give one range, a-z or A-Z (it doesn't matter which). It will match both if we add the case-insensitive flag, i.
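Here is a sketch of what the i flag changes, using a made-up sample string. Without it, the uppercase W breaks the second word.

```javascript
// Hypothetical sample password, used only for illustration.
const flagSample = 'hello World 42!';

// Without the i flag, a-z only matches lowercase letters.
const caseSensitiveWords = flagSample.match(/[a-z]+/g);
console.log(caseSensitiveWords); // ["hello", "orld"]

// With the i flag, the same range matches uppercase letters too.
const caseInsensitiveWords = flagSample.match(/[a-z]+/gi);
console.log(caseInsensitiveWords); // ["hello", "World"]
```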
Summary
- Regular expressions let us look for matches in strings.
- Specify options using a concise syntax.
- Allows us to do certain things much more easily than with plain JavaScript string code.
- Put the thing we want to match between the slashes, as in 'string'.match(/s/)
- . will match any character: 'string'.match(/./)
- The global flag, g, will give us multiple matches (find all): 'strings'.match(/s/g) // ["s", "s"]
- The pipe | will look for a or b: 'abc'.match(/a|b/g).
- Specify "or" for single characters by putting them in square brackets (a set): 'abc'.match(/[abc]/g)
- Specify ranges with the - symbol: 'abc'.match(/[a-z]/g)
- Combine ranges: 'abcDEF123'.match(/[a-zA-Z0-9]/g)
- Use the negation operator, ^, to get things that don't match our specified values: 'abc123'.match(/[^a-z]/g) // ["1", "2", "3"]
- Match consecutive characters rather than single characters, using quantifiers {} (no space after the comma): 'string'.match(/[a-z]{1,20}/g)
- Leave off the upper limit to go to infinity
- 'string'.match(/[a-z]{1,}/g)
- Or just type: 'string'.match(/[a-z]+/g)
- Match uppercase or lowercase by adding the case insensitive flag, i: 'string'.match(/[a-z]/gi)
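Putting the summary together, a minimal sketch of the password grading from the start of these notes might look like this. The names countMatches and passwordStats and the sample password are my own assumptions, not from the lessons. One thing to watch for: match() with the g flag returns null when nothing matches, so the helper treats null as zero.

```javascript
// Hypothetical helper: count how many times a global pattern matches.
// match() returns null (not an empty array) when there are no matches.
function countMatches(text, pattern) {
  const matches = text.match(pattern);
  return matches === null ? 0 : matches.length;
}

// Hypothetical function: gather the six counts from the requirements list.
function passwordStats(password) {
  return {
    total: countMatches(password, /./g),
    lowercase: countMatches(password, /[a-z]/g),
    uppercase: countMatches(password, /[A-Z]/g),
    digits: countMatches(password, /[0-9]/g),
    special: countMatches(password, /[^a-zA-Z0-9]/g),
    words: countMatches(password, /[a-z]+/gi),
  };
}

const stats = passwordStats('hello World 42!');
console.log(stats);
// { total: 15, lowercase: 9, uppercase: 1, digits: 2, special: 3, words: 2 }
```

How to turn these counts into a "strong" or "weak" grade is a separate decision that these notes don't cover.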