chapter eleven

11 Regular expressions and regexp-based string operations

 

This chapter covers

  • Regular expression syntax
  • Pattern-matching operations
  • The MatchData class
  • Built-in methods based on pattern matching

In this chapter, we’ll explore Ruby’s facilities for pattern matching and text processing, centering on the use of regular expressions. A regular expression in Ruby serves the same purposes it does in other languages: it specifies a pattern of characters that a given string may or may not match. Pattern-match operations are used for conditional branching (match/no match), pinpointing substrings (parts of a string that match parts of the pattern), and various text-filtering techniques.
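As a first taste of the branching and substring uses just mentioned, here is a minimal sketch. The sample string and the variable names are made up for illustration, and `match?` assumes Ruby 2.4 or later.

```ruby
# Hypothetical sample data, invented for this illustration.
line = "Error: disk full on /dev/sda1"

# Conditional branching: match? answers true or false.
status = line.match?(/Error/) ? "problem" : "ok"

# Pinpointing a substring: parentheses capture part of the match.
# match returns nil on failure, so guard before indexing.
m = line.match(/on (\S+)/)
device = m[1] if m

puts status   # problem
puts device   # /dev/sda1
```

The details of captures and of the object that `match` returns come later in the chapter; the point here is only that one pattern can drive both a yes/no decision and a substring extraction.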

Regular expressions in Ruby are objects, specifically instances of the Regexp class. As with all other objects in Ruby, you send messages to a regular expression.
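The following short sketch shows what it means for a regular expression to be an object that receives messages. The pattern and test string are arbitrary examples; the methods called are standard `Regexp` methods.

```ruby
pattern = /ruby/i   # literal regexp with the case-insensitive modifier

puts pattern.class                   # Regexp
puts pattern.source                  # ruby
puts pattern.casefold?               # true
puts pattern.match?("I like Ruby")   # true

# The same kind of object can be built with the constructor.
built = Regexp.new("ruby", Regexp::IGNORECASE)
puts built.class                     # Regexp
```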

We’ll start with an overview of regular expressions. From there, we’ll move on to the details of how to write them and, of course, how to use them. In the latter category, we’ll look at using regular expressions both in simple match operations and in methods where they play a role in a larger process, such as filtering a collection or repeatedly scanning a string.
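To make the last category concrete, here is a small preview of a regular expression playing a role in a larger process. The sample data is hypothetical; `grep` and `scan` are covered properly in section 11.7.

```ruby
# Filtering a collection: keep only the elements that match.
words = %w[apple banana cherry avocado]
a_words = words.grep(/\Aa/)
p a_words   # ["apple", "avocado"]

# Repeatedly scanning a string: collect every match, left to right.
inventory = "10 apples, 25 pears, 7 plums"
numbers = inventory.scan(/\d+/)
p numbers   # ["10", "25", "7"]
```

Note that `scan` returns the matched text as strings; converting them to integers would be a separate step.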

11.1 What are regular expressions?

11.2 Writing regular expressions

11.2.1 Seeing patterns

11.2.2 Simple matching with literal regular expressions

11.3 Building a pattern in a regular expression

11.3.1 Literal characters in patterns

11.3.2 The dot wildcard character (.)

11.3.3 Character classes

11.3.4 The | (pipe) “or” operator

11.4 Matching, substring captures, and MatchData

11.4.1 Capturing submatches with parentheses

11.4.2 Match success and failure

11.4.3 Two ways of getting the captures

11.4.4 Other MatchData information

11.5 Fine-tuning regular expressions with quantifiers, anchors, and modifiers

11.5.1 Constraining matches with quantifiers

11.5.2 Greedy (and non-greedy) quantifiers

11.5.3 Regular expression anchors and assertions

11.5.4 Modifiers

11.6 Converting strings and regular expressions to each other

11.6.1 String-to-regexp idioms

11.6.2 Going from a regular expression to a string

11.7 Common methods that use regular expressions

11.7.1 String#scan

11.7.2 String#split

11.7.3 sub/sub! and gsub/gsub!

11.7.4 Case equality and grep

11.8 Summary