Jacob Ruiz

Mastering JavaScript Fundamentals: Regular Expressions

Get the fundamentals down and the level of everything you do will rise. - Michael Jordan

As stated in my original post, I do 1 hour of video lessons from Watch and Code every day. If you're interested in learning JavaScript in a way that goes beyond basic tutorials and gives you foundational, practical knowledge without relying on frameworks, I'd highly recommend it. If you're reading these posts, please keep in mind that these are just my notes, and I'm not an expert (yet!). If your goal is also to master the fundamentals of JavaScript, please head over to Watch and Code and start your journey there!

All screenshots were annotated using Shotty.


Regular Expressions

  • A concise way to look for patterns in strings.

  • Pretty much the same in most programming languages, apart from small differences in syntax and supported features.

  • Goal of this exercise is to get comfortable enough to figure out regular expressions when they come up.

How do we figure out which passwords are strong and which ones are weak?

Grading passwords with regular expressions:

  • Number of total characters
  • Number of lowercase letters
  • Number of uppercase letters
  • Number of digits
  • Number of special characters
  • Number of "words" (consecutive letters)

Some basic regex examples to get started:

See this content in the original post
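The original snippets aren't reproduced here, so this is only a minimal sketch of what a first few expressions could look like. The sample string and variable names are my own assumptions, chosen just for illustration.

```javascript
// Hypothetical sample string, picked only for illustration.
const sample = "abcabc";

// Without a flag, match() stops at the first match.
const firstA = sample.match(/a/);

// The global flag, g, returns every match.
const allA = sample.match(/a/g);

// The pipe, |, means "or".
const aOrB = sample.match(/a|b/g);

// Listing every character by hand works, but gets long quickly.
const aOrBOrC = sample.match(/a|b|c/g);

console.log(firstA[0]); // "a"
console.log(allA); // ["a", "a"]
console.log(aOrB); // ["a", "b", "a", "b"]
console.log(aOrBOrC); // ["a", "b", "c", "a", "b", "c"]
```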

The last expression is not a sustainable way to detect all characters, since you'd have to do a lot of typing, and account for both uppercase and lowercase, other characters, spaces, etc.

Luckily there's a very concise way to do this in regex:

See this content in the original post
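As a rough sketch (again with a made-up sample string, not the one from the lesson), the dot combined with the global flag picks up every character, including spaces and punctuation:

```javascript
// Hypothetical sample string, picked only for illustration.
const sample = "a B!";

// . matches any single character (except line breaks); g finds them all.
const everyCharacter = sample.match(/./g);

console.log(everyCharacter); // ["a", " ", "B", "!"]
```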

This covers the first requirement on our list, the number of total characters:

  • Number of total characters
  • Number of lowercase letters
  • Number of uppercase letters
  • Number of digits
  • Number of special characters
  • Number of "words" (consecutive letters)

These shortcuts, such as using . to represent any character, are called "metacharacters". (Strictly speaking, . matches any character except line breaks.)

Only use regular expressions when it's a lot easier than the alternative. 

Compare these:

See this content in the original post
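A sketch of the comparison, assuming a sample password of my own choosing. Both approaches give the same number here, but the plain length property is shorter and has no surprises:

```javascript
// Hypothetical sample password, used only for illustration.
const password = "Hello#World42!";

// With a regular expression: match every character, then count the matches.
const lengthWithRegex = password.match(/./g).length;

// Without one: just ask the string.
const lengthWithProperty = password.length;

console.log(lengthWithRegex); // 14
console.log(lengthWithProperty); // 14

// One catch: . skips line breaks by default, so the two can disagree.
const twoLines = "a\nb";
const twoLinesWithRegex = twoLines.match(/./g).length;

console.log(twoLinesWithRegex); // 2
console.log(twoLines.length); // 3
```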

Not the best use of regular expressions to get the length of this string, but a good introduction to how they work.

Let's look at the next one of our requirements:

Number of lowercase letters

See this content in the original post
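One way this could look, assuming we spell out every lowercase letter by hand inside a set (the sample password is hypothetical):

```javascript
// Hypothetical sample password, used only for illustration.
const password = "Hello#World42!";

// A set, [], matches any one of the characters listed inside it.
const lowercaseLetters = password.match(/[abcdefghijklmnopqrstuvwxyz]/g);

console.log(lowercaseLetters); // ["e", "l", "l", "o", "o", "r", "l", "d"]
console.log(lowercaseLetters.length); // 8
```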

But we can do better. We can use something called ranges. Ranges only work in sets (the square brackets, []). We express a range with a dash.

See this content in the original post
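A minimal sketch of a range in action, using the same made-up sample password as an assumption:

```javascript
// Hypothetical sample password, used only for illustration.
const password = "Hello#World42!";

// a-z inside a set means "any character from a through z".
const lowercaseLetters = password.match(/[a-z]/g);

console.log(lowercaseLetters); // ["e", "l", "l", "o", "o", "r", "l", "d"]
console.log(lowercaseLetters.length); // 8
```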

So that satisfies requirement #2 of counting the number of lowercase letters.

Without regular expressions this would take noticeably more code.

See this content in the original post
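For comparison, here is one possible way to count lowercase letters without a regular expression. It is just a sketch with a hypothetical sample password, and there are other ways to write it:

```javascript
// Hypothetical sample password, used only for illustration.
const password = "Hello#World42!";

let lowercaseCount = 0;

// Walk through the string one character at a time.
for (const character of password) {
  // Single characters compare by character code, so this checks a through z.
  if (character >= "a" && character <= "z") {
    lowercaseCount += 1;
  }
}

console.log(lowercaseCount); // 8
```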

Let's look at our next requirement:

Number of uppercase characters

We can use a range in the same way we did for lowercase letters, but we just need to change them to uppercase. Easy.

See this content in the original post
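Sketched with the same assumed sample password:

```javascript
// Hypothetical sample password, used only for illustration.
const password = "Hello#World42!";

// Same idea as before, with an uppercase range.
const uppercaseLetters = password.match(/[A-Z]/g);

console.log(uppercaseLetters); // ["H", "W"]
console.log(uppercaseLetters.length); // 2
```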

Number of digits

Again, we can use a range here to do this easily:

See this content in the original post
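A quick sketch, again assuming a made-up sample password:

```javascript
// Hypothetical sample password, used only for illustration.
const password = "Hello#World42!";

// 0-9 inside a set means "any single digit".
const digits = password.match(/[0-9]/g);

console.log(digits); // ["4", "2"]
console.log(digits.length); // 2
```

Note that the matches come back as strings, not numbers.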

Number of special characters

(Anything that is not a letter or a digit).

Remember that we had this regular expression that gave us anything that is a letter or a digit:

See this content in the original post
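As a sketch, a single set can hold several ranges side by side. The sample password is my own assumption:

```javascript
// Hypothetical sample password, used only for illustration.
const password = "Hello#World42!";

// Three ranges in one set: lowercase, uppercase, and digits.
const lettersAndDigits = password.match(/[a-zA-Z0-9]/g);

console.log(lettersAndDigits.join("")); // "HelloWorld42"
console.log(lettersAndDigits.length); // 12
```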

If we just had a way to say that we want anything that is not this expression, we'd be in good shape. Well, lucky for us, that exists: the caret (^).

The way to interpret the caret at the start of a set is: match any character that is not in the set.

See this content in the original post
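A small sketch showing the same set with and without the caret, on a short made-up string:

```javascript
// Hypothetical sample string, picked only for illustration.
const sample = "a1!";

// Without the caret: characters that ARE letters or digits.
const inTheSet = sample.match(/[a-zA-Z0-9]/g);

// With the caret first in the set: characters that are NOT letters or digits.
const notInTheSet = sample.match(/[^a-zA-Z0-9]/g);

console.log(inTheSet); // ["a", "1"]
console.log(notInTheSet); // ["!"]
```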

Let's try it out:

See this content in the original post
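Trying it on a full password might look like this (the password itself is an assumption of mine):

```javascript
// Hypothetical sample password, used only for illustration.
const password = "Hello#World42!";

// Anything that is not a letter or a digit counts as a special character.
const specialCharacters = password.match(/[^a-zA-Z0-9]/g);

console.log(specialCharacters); // ["#", "!"]
console.log(specialCharacters.length); // 2
```

With this definition a space would also count as a special character, since it is neither a letter nor a digit.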

So the new concept from this requirement is the negation operator (^) that we can use in sets. The way to use it is:

  • Create a set with []
  • Define the characters you don't want
  • Put the negation operator ^ first, right after the opening [

 

Number of "words" (consecutive letters)
 

Let's start by using what we know so far. We know we want to match letters, so that's a good starting point. We'll use a set with 2 ranges: lowercase and uppercase letters:

See this content in the original post
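A sketch of that starting point, with a hypothetical sample password. Notice that every letter comes back as its own match:

```javascript
// Hypothetical sample password, used only for illustration.
const password = "Hello#World42!";

// A set with two ranges: any single lowercase or uppercase letter.
const singleLetters = password.match(/[a-zA-Z]/g);

console.log(singleLetters);
// ["H", "e", "l", "l", "o", "W", "o", "r", "l", "d"]
console.log(singleLetters.length); // 10
```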

The problem with this regular expression is that it only matches single characters.

We want to match consecutive characters.

Luckily regular expressions have a concise way to look for consecutive characters. The feature is called quantifiers.

The way it works is: a quantifier goes right after the thing you want to repeat and says how many times in a row it may occur. If you allow up to 3 letters and there are 3 letters in a row, it will match those 3 letters together as one match. You choose the lower and upper limits yourself.

Let's say we're looking for letters that happen consecutively, 1 through 20 times. This should detect all the sets of consecutive letters ("words") in our string:

See this content in the original post
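A minimal sketch of the quantifier, assuming the same made-up password. The limits go inside curly braces, with no space after the comma:

```javascript
// Hypothetical sample password, used only for illustration.
const password = "Hello#World42!";

// {1,20} means "between 1 and 20 of the preceding set, in a row".
const words = password.match(/[a-zA-Z]{1,20}/g);

console.log(words); // ["Hello", "World"]
console.log(words.length); // 2
```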

This is nice, but it's a bit of a hack. What if there is a word longer than 20 characters? It would be split into more than one match. We need a way to tell it to go to infinity. To do this, we simply leave out the upper limit:

See this content in the original post
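To see the difference, here is a sketch with a made-up 25-letter "word". The capped version cuts it into two matches, while the open-ended version keeps it whole:

```javascript
// Hypothetical 25-letter "word", built only for illustration.
const longWord = "a".repeat(25);

// Upper limit of 20: the word is split into a 20-letter and a 5-letter match.
const capped = longWord.match(/[a-zA-Z]{1,20}/g);

// No upper limit: the whole word is one match.
const uncapped = longWord.match(/[a-zA-Z]{1,}/g);

console.log(capped.length); // 2
console.log(capped[0].length, capped[1].length); // 20 5
console.log(uncapped.length); // 1
console.log(uncapped[0].length); // 25
```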

Let's look at a few more things:

See this content in the original post

But there are a couple of twists that are common:

Wanting to match one or more consecutive characters is so common that there is a shortcut: the plus sign, +.

See this content in the original post
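A sketch of both spellings side by side, on an assumed sample password. They find exactly the same matches:

```javascript
// Hypothetical sample password, used only for illustration.
const password = "Hello#World42!";

// First: the explicit quantifier, "one or more".
const withBraces = password.match(/[a-zA-Z]{1,}/g);

// Second: the + shortcut, which means the same thing.
const withPlus = password.match(/[a-zA-Z]+/g);

console.log(withBraces); // ["Hello", "World"]
console.log(withPlus); // ["Hello", "World"]
```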

The second one is more common.

The last thing to note is that it's so common to want to find letters, it can be convenient to not need to specify two ranges.

We can just give one range, a-z or A-Z (it doesn't matter). It will match both if we add the case insensitive flag, i.

See this content in the original post
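A quick sketch of what the i flag changes, using a hypothetical sample password:

```javascript
// Hypothetical sample password, used only for illustration.
const password = "Hello#World42!";

// Without i, a-z only matches lowercase, so the capitals are left out.
const caseSensitive = password.match(/[a-z]+/g);

// With i, the same range matches both cases.
const caseInsensitive = password.match(/[a-z]+/gi);

console.log(caseSensitive); // ["ello", "orld"]
console.log(caseInsensitive); // ["Hello", "World"]
```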

Summary

  • Regular expressions let us look for matches in strings.
  • Specify options using a concise syntax.
  • They let us do certain things much more easily than with plain JavaScript string code.
  • Put the pattern we want to match between the slashes: 'string'.match(/s/)
  • . will match any character (except line breaks): 'string'.match(/./)
  • The global flag, g, will give us multiple matches (find all): 'strings'.match(/s/g) // ["s", "s"]
  • The pipe | will look for a or b: 'abc'.match(/a|b/g).
  • Match any one of several characters by putting them in brackets (a set): 'abc'.match(/[abc]/g)
  • Specify ranges with the - symbol: 'abc'.match(/[a-z]/g)
  • Combine ranges: 'abcDEF123'.match(/[a-zA-Z0-9]/g)
  • Use the negation operator, ^, to get things that don't match our specified values: 'abc123'.match(/[^a-z]/g) // ["1", "2", "3"]
  • Match consecutive characters rather than single characters, using quantifiers {}: 'string'.match(/[a-z]{1,20}/g) (no space after the comma)
  • Leave off the upper limit to go to infinity: 'string'.match(/[a-z]{1,}/g)
  • Or just type: 'string'.match(/[a-z]+/g)
  • Match uppercase or lowercase by adding the case insensitive flag, i: 'string'.match(/[a-z]/gi)
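
Putting the pieces together, a password grader could be sketched roughly like this. The function names, the shape of the result, and the sample password are all my own assumptions, not something from the lesson. One detail worth knowing: with the g flag, match() returns null when nothing matches, so the helper treats that as a count of zero.

```javascript
// Hypothetical helper: counts matches, treating "no match" (null) as zero.
function countMatches(text, pattern) {
  const matches = text.match(pattern);
  return matches === null ? 0 : matches.length;
}

// Hypothetical grader: one count per requirement from the list.
function gradePassword(password) {
  return {
    totalCharacters: password.length,
    lowercaseLetters: countMatches(password, /[a-z]/g),
    uppercaseLetters: countMatches(password, /[A-Z]/g),
    digits: countMatches(password, /[0-9]/g),
    specialCharacters: countMatches(password, /[^a-zA-Z0-9]/g),
    words: countMatches(password, /[a-z]+/gi),
  };
}

console.log(gradePassword("Hello#World42!"));
// totalCharacters: 14, lowercaseLetters: 8, uppercaseLetters: 2,
// digits: 2, specialCharacters: 2, words: 2
```

How to turn these counts into a "strong" or "weak" verdict is a separate decision that this sketch leaves open.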