Scala Regular Expressions: Examples & Reference


HEADS-UP

  • Prefer """ (triple double quotes) as delimiters for your expressions: backslashes are taken literally, so \d does not have to be written as \\d.

  • In replacement strings, the captures returned by group(1), group(2) and so on can also be written as "$1", "$2" and so on. Use whichever you think is clearer.
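As a small sketch of both points, the snippet below writes the same pattern with and without triple quotes and then swaps two captures, once with "$1"/"$2" and once with group(). The names escapedPat, rawPat and the sample input are made up for this example.

```scala
// The same pattern written twice: once with escaped backslashes, once with triple quotes.
val escapedPat = "(\\d+)-(\\d+)".r
val rawPat = """(\d+)-(\d+)""".r

// Both produce the same regular expression source.
val samePattern = escapedPat.regex == rawPat.regex

// "$1" and "$2" in a replacement string refer to the first and second capture.
val swapped = rawPat.replaceAllIn("10-20", "$2-$1")
// "20-10"

// The equivalent using group(1) and group(2) on the Match object.
val swappedWithGroups = rawPat.replaceAllIn("10-20", m => s"${m.group(2)}-${m.group(1)}")
// "20-10"
```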

String matches regex

String must fully match regex

"5".matches("""\d""")
// true

String contains regex

"""pattern""".r.findAllIn("string").length != 0

// findAllIn returns an iterator, so check that it is not empty
"""\d+""".r.findAllIn("foo456bar").length != 0
// true
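If you only care whether there is at least one match, findFirstIn is a possible alternative, since it returns an Option and stops at the first hit. A minimal sketch (the name digitsPat is just for illustration):

```scala
val digitsPat = """\d+""".r

// findFirstIn returns an Option[String]; isDefined answers "is there any match?"
val hasDigits = digitsPat.findFirstIn("foo456bar").isDefined
// true

val noDigits = digitsPat.findFirstIn("foobar").isDefined
// false
```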

Get first regex match in String

val numberPat = """\d+""".r
val str = "foo 123 bar 456"

println(numberPat.findFirstIn(str))
// prints Some(123)

Iterate over regex matches

val numberPat = """\d+""".r
val str = "foo 123 bar 456"

numberPat.findAllMatchIn(str).foreach(println)
// prints 123
// 456

Get matches as List

Just call toList:

val numberPat = """\d+""".r
val str = "foo 123 bar 456"

(numberPat.findAllMatchIn(str)).toList
// List(123, 456)
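Note that findAllMatchIn gives you Match objects rather than plain strings. The sketch below contrasts it with findAllIn, which yields the matched substrings directly. The variable names are made up for this example.

```scala
import scala.util.matching.Regex

val numPat = """\d+""".r
val text = "foo 123 bar 456"

// findAllIn yields the matched substrings directly.
val asStrings: List[String] = numPat.findAllIn(text).toList
// List("123", "456")

// findAllMatchIn yields Match objects, which also carry positions and groups.
val asMatches: List[Regex.Match] = numPat.findAllMatchIn(text).toList
val starts: List[Int] = asMatches.map(_.start)
// List(4, 12)
```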

Search and replace regex

Use replaceAllIn to replace all occurrences of the regexp with the given string, and replaceFirstIn to replace just the first match:

val lettersPat = """[a-zA-Z]+""".r
val str = "foo123bar"

lettersPat.replaceAllIn(str,"YAY")
// "YAY123YAY"
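For comparison, here is a short sketch of replaceFirstIn on the same input, next to replaceAllIn (wordPat is just an illustrative name):

```scala
val wordPat = """[a-zA-Z]+""".r

// Only the first run of letters is replaced.
val firstOnly = wordPat.replaceFirstIn("foo123bar", "YAY")
// "YAY123bar"

// Every run of letters is replaced.
val all = wordPat.replaceAllIn("foo123bar", "YAY")
// "YAY123YAY"
```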

Search and replace regex with custom function

You can also perform substitutions with a custom function of type scala.util.matching.Regex.Match => String as the second parameter to function replaceAllIn:

val lettersPat = """[a-zA-Z]+""".r
val str = "foo123bar"

lettersPat.replaceAllIn(str, m => m.toString.toUpperCase)
// "FOO123BAR"

Search and replace regex with captures

If you need placeholders inside blocks of text, one strategy is to mark them with uncommon character sequences so that they can be easily parsed later on.

For example, you can wrap identifiers in hashes (#) within your text and then parse them afterwards.

N.B.: group(0) returns the full match, group(1) returns the first capture (within parentheses), group(2) returns the second capture and so on.

val pat = """##(\d+)##""".r
val str = "foo##123##bar"

// using a "replacer" function that replaces the number found with double its value
pat.replaceAllIn(str, m => (m.group(1).toInt * 2).toString)
// "foo246bar"
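The placeholder strategy described above could look roughly like the sketch below, where each ##identifier## is looked up in a Map. The values map, the template and the identifier names are hypothetical. Regex.quoteReplacement is used so that a "$" or "\" in a substituted value is not interpreted as a group reference.

```scala
import scala.util.matching.Regex

// Hypothetical placeholder values, made up for this example.
val values = Map("name" -> "Alice", "city" -> "Lisbon")

val placeholderPat = """##(\w+)##""".r
val template = "Hello ##name##, welcome to ##city##!"

// Look each captured identifier up in the map; unknown placeholders are left untouched.
val rendered = placeholderPat.replaceAllIn(template, m =>
  Regex.quoteReplacement(values.getOrElse(m.group(1), m.matched)))
// "Hello Alice, welcome to Lisbon!"
```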

Extract capture into variable

// notice the .r method at the end
val pat = """(\d{4})-([0-9]{2})""".r
val myString = "2016-02"

val pat(year, month) = myString
// year: String = 2016
// month: String = 02
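Keep in mind that this form throws a scala.MatchError when the string does not match the pattern. If the input is not guaranteed to match, one option is to wrap the extraction in a match expression that returns an Option. A sketch, where YearMonthPat and parseYearMonth are hypothetical names:

```scala
val YearMonthPat = """(\d{4})-([0-9]{2})""".r

// Hypothetical helper: returns None instead of throwing when the input does not match.
def parseYearMonth(s: String): Option[(Int, Int)] = s match {
  case YearMonthPat(year, month) => Some((year.toInt, month.toInt))
  case _ => None
}

val parsed = parseYearMonth("2016-02")
// Some((2016, 2))

val unparsed = parseYearMonth("not a date")
// None
```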

Extract regexes with pattern matching

You can also use pattern matching to test a string against multiple regular expressions:

// these are the strings we want to check
val dateNoDay = "2016-08"
val dateWithDay = "2016-08-20"

// these are the patterns (note the starting capital letter)
val YearAndMonth = """(\d{4})-([01][0-9])""".r
val YearMonthAndDay = """(\d{4})-([01][0-9])-([012][0-9])""".r

// this prints: "day provided: it is 20"
dateWithDay match{
  case YearAndMonth(year,month) => println("no day provided")
  case YearMonthAndDay(year,month,day) => println(s"day provided: it is $day")
}

As with pattern matching on regular case classes, you will get a MatchError if you exhaust your options without matching anything:

// this won't match any patterns
val badString = "foo-bar-baz"

// scala.MatchError: foo-bar-baz (of class java.lang.String)
badString match{
  case YearAndMonth(year,month) => println("no day provided")
  case YearMonthAndDay(year,month,day) => println(s"day provided: it is $day")
}
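To avoid the MatchError you can add a catch-all case at the end. A minimal sketch, assuming the same two patterns as above. The function name describe and its messages are made up for this example.

```scala
val YearAndMonthPat = """(\d{4})-([01][0-9])""".r
val YearMonthAndDayPat = """(\d{4})-([01][0-9])-([012][0-9])""".r

// Hypothetical helper that never throws: the last case catches everything else.
def describe(s: String): String = s match {
  case YearAndMonthPat(year, month) => s"no day provided for $year-$month"
  case YearMonthAndDayPat(_, _, day) => s"day provided: it is $day"
  case _ => "not a date"
}

val withDay = describe("2016-08-20")
// "day provided: it is 20"

val fallback = describe("foo-bar-baz")
// "not a date"
```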

Even if there are no capturing groups you must use empty parens

// groups starting with ?: are "non-capturing" groups
// so this pattern has no capturing groups
val Pattern = """^(?:foo|bar)\.baz""".r

// no matching groups but you must use empty parens
"""foo.baz""" match {
  case Pattern() => "ok" // MUST have parens at the end of Pattern
  case _ => "bleh"
}

Case-insensitive Regex

Prepend (?i) to patterns to make them case-insensitive.

See all possible modifiers in the Java 7 Docs for Regular Expressions

// no match using case-sensitive pattern
val caseSensitivePattern = """foo\d+"""

"Foo123".matches(caseSensitivePattern)
// res0: Boolean = false

// the same pattern now matches if you use the case-insensitive modifier:
val caseInsensitivePattern = """(?i)foo\d+"""

"Foo123".matches(caseInsensitivePattern)
// res1: Boolean = true
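The (?i) modifier should work the same way once the string is turned into a Regex with .r, for instance in pattern matching or with findFirstIn. A small sketch (FooNumber and the sample inputs are illustrative only):

```scala
val FooNumber = """(?i)foo(\d+)""".r

// Pattern matching ignores case for the "foo" part and still captures the digits.
val digits = "FOO42" match {
  case FooNumber(n) => n
  case _ => ""
}
// "42"

// findFirstIn returns the text as it appears in the input.
val firstFoo = FooNumber.findFirstIn("say Foo7 now")
// Some("Foo7")
```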
