Scala Regular Expressions: Examples & Reference

Here are a few ways in which you can use regular expressions (regexp for short) in Scala.

Find out whether or not a string fully matches a given regexp

"12345".matches("""\d+""")
// returns true
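As a minimal sketch of the full-match check with a reusable, precompiled regex: the object and helper names below are illustrative, not part of any library. Both String.matches and the underlying java.util.regex matcher succeed only when the entire input matches.

```scala
import scala.util.matching.Regex

object FullMatchExample {
  // Compile the pattern once so it can be reused across calls
  val digits: Regex = """\d+""".r

  // Hypothetical helper: true only if the entire input matches the regex
  def fullyMatches(regex: Regex, input: String): Boolean =
    regex.pattern.matcher(input).matches()

  def main(args: Array[String]): Unit = {
    println(fullyMatches(digits, "12345"))  // true
    println(fullyMatches(digits, "123abc")) // false: the trailing letters are not digits
    println("12345".matches("""\d+"""))     // true: String.matches also requires a full match
  }
}
```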

Find out whether a given string contains a given pattern

// findAllIn returns an iterator of matches, so check that it is not empty
"""\d+""".r.findAllIn("foo456bar").nonEmpty
// returns true
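An alternative sketch for the "contains" check: findFirstIn returns an Option and stops at the first match, so there is no need to walk over every match. The helper name containsMatch is made up for this example.

```scala
import scala.util.matching.Regex

object ContainsExample {
  val digits: Regex = """\d+""".r

  // Hypothetical helper: true if the regex matches anywhere in the input
  def containsMatch(regex: Regex, input: String): Boolean =
    regex.findFirstIn(input).isDefined

  def main(args: Array[String]): Unit = {
    println(containsMatch(digits, "foo456bar")) // true
    println(containsMatch(digits, "foobar"))    // false
  }
}
```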

Find the first match of a regexp in a String

val numbers = """\d+""".r
val str = "foo 123 bar 456"

println(numbers.findFirstIn(str))
// prints Some(123)

Iterate over all matches of given regexp in a string

val numbers = """\d+""".r
val str = "foo 123 bar 456"

numbers.findAllMatchIn(str).foreach(println)
// prints 123
//        456
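Each item produced by findAllMatchIn is a Regex.Match, which carries more than the matched text. The sketch below, with a made-up helper name, collects the text plus its start and (exclusive) end offsets.

```scala
object MatchPositionsExample {
  val numbers = """\d+""".r

  // Hypothetical helper: (matched text, start index, exclusive end index) for every match
  def positions(input: String): List[(String, Int, Int)] =
    numbers.findAllMatchIn(input).map(m => (m.matched, m.start, m.end)).toList

  def main(args: Array[String]): Unit =
    positions("foo 123 bar 456").foreach { case (text, start, end) =>
      println(s"$text at [$start, $end)")
    }
  // prints 123 at [4, 7)
  //        456 at [12, 15)
}
```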

Return all regexp matches as a List

Just call toList on the iterator returned by findAllIn:

val numbers = """\d+""".r
val str = "foo 123 bar 456"

numbers.findAllIn(str).toList
// returns List(123, 456)

Search and replace a regexp with a given value

Use replaceAllIn to replace all occurrences of the regexp with the given string, and replaceFirstIn to replace just the first match:

val letters = """[a-zA-Z]+""".r
val str = "foo123bar"

letters.replaceAllIn(str, "YAY")
// returns "YAY123YAY"
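For comparison, here is a small sketch that runs replaceFirstIn and replaceAllIn side by side on the same input (the object name is only for illustration):

```scala
object ReplaceFirstExample {
  val letters = """[a-zA-Z]+""".r

  def main(args: Array[String]): Unit = {
    val str = "foo123bar"
    println(letters.replaceFirstIn(str, "YAY")) // YAY123bar
    println(letters.replaceAllIn(str, "YAY"))   // YAY123YAY
  }
}
```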

Search and replace regexp using a custom function

You can also perform substitutions by passing a custom function of type scala.util.matching.Regex.Match => String as the second argument to replaceAllIn:

val letters = """[a-zA-Z]+""".r
val str = "foo123bar"

letters.replaceAllIn(str, m => m.toString.toUpperCase)
// returns "FOO123BAR"

Replacing Captures

If you need to put placeholders inside blocks of text, one strategy is to mark them with uncommon character sequences so that they are easy to find later on. For example, you can wrap identifiers in hashes (#) within your text and then process them afterwards.

N.B.: group(0) returns the full match, group(1) returns the first capture (within parentheses), group(2) returns the second capture and so on.

val exp = """##(\d+)##""".r
val str = "foo##123##bar"

// using a "replacer" function that replaces the number found with double its value
exp.replaceAllIn(str, m => (m.group(1).toInt * 2).toString)
// returns "foo246bar"
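The same idea can be sketched for named placeholders filled from a lookup table. Everything here (the ##name## syntax with word characters, the fill helper, the sample values) is an assumption made for illustration. Note that the string returned by the replacer function is still treated as a replacement string, so Regex.quoteReplacement is used to keep characters such as "$" literal.

```scala
import scala.util.matching.Regex

object PlaceholderExample {
  // Placeholders look like ##name##; the name is captured as group 1
  val placeholder: Regex = """##(\w+)##""".r

  // Hypothetical helper: fills placeholders from a lookup table,
  // leaving unknown placeholders untouched
  def fill(template: String, values: Map[String, String]): String =
    placeholder.replaceAllIn(template, m => {
      val replacement = values.getOrElse(m.group(1), m.matched)
      // quoteReplacement escapes "$" and "\", which are special in replacement strings
      Regex.quoteReplacement(replacement)
    })

  def main(args: Array[String]): Unit = {
    val values = Map("user" -> "Ana", "price" -> "$10")
    println(fill("Hi ##user##, total: ##price## (##unknown##)", values))
    // prints Hi Ana, total: $10 (##unknown##)
  }
}
```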

Extract pattern captures into variables

// notice the .r method at the end: it turns the String into a Regex
val pattern = """(\d{4})-([0-9]{2})""".r
val myString = "2016-02"

val pattern(year,month) = myString
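Keep in mind that a val extraction like the one above throws a scala.MatchError when the string does not match the pattern. A hedged sketch of a safer variant, using a match expression that returns an Option (the helper name parseYearMonth is made up):

```scala
object SafeExtractExample {
  val YearMonth = """(\d{4})-([0-9]{2})""".r

  // Hypothetical helper: Some((year, month)) on a full match, None otherwise
  def parseYearMonth(s: String): Option[(Int, Int)] = s match {
    case YearMonth(year, month) => Some((year.toInt, month.toInt))
    case _                      => None
  }

  def main(args: Array[String]): Unit = {
    println(parseYearMonth("2016-02"))    // Some((2016,2))
    println(parseYearMonth("not-a-date")) // None
  }
}
```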

Extract regexes using pattern matching

You can also use pattern matching to test a string against multiple regular expressions:

// these are the strings we want to check
val dateNoDay = "2016-08"
val dateWithDay = "2016-08-20"

// these are the patterns (note the starting capital letter) 
val YearAndMonth = """(\d{4})-([01][0-9])""".r
val YearMonthAndDay = """(\d{4})-([01][0-9])-([0-3][0-9])""".r

// this prints: "day provided: it is 20"
dateWithDay match {
  case YearAndMonth(year, month) => println("no day provided")
  case YearMonthAndDay(year, month, day) => println(s"day provided: it is $day")
}

As with regular case classes, you will get a MatchError if you exhaust your options without matching anything:

// this won't match any patterns
val badString = "foo-bar-baz"

// scala.MatchError: foo-bar-baz (of class java.lang.String)
badString match {
  case YearAndMonth(year, month) => println("no day provided")
  case YearMonthAndDay(year, month, day) => println(s"day provided: it is $day")
}
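One way to avoid the MatchError is to add a default case. The sketch below wraps the match in a hypothetical describe function so that unmatched input yields a fallback message instead of an exception; the day group here accepts digits 0-3 as its first character, which is a choice made for this example.

```scala
object DescribeDateExample {
  val YearAndMonth = """(\d{4})-([01][0-9])""".r
  val YearMonthAndDay = """(\d{4})-([01][0-9])-([0-3][0-9])""".r

  // Hypothetical helper: never throws, thanks to the catch-all case
  def describe(s: String): String = s match {
    case YearAndMonth(_, _)         => "no day provided"
    case YearMonthAndDay(_, _, day) => s"day provided: it is $day"
    case _                          => "not a recognized date"
  }

  def main(args: Array[String]): Unit = {
    println(describe("2016-08"))     // no day provided
    println(describe("2016-08-20"))  // day provided: it is 20
    println(describe("foo-bar-baz")) // not a recognized date
  }
}
```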

Even if there are no capturing groups, you must use empty parens

// groups starting with ?: are "non-capturing" groups
// so this pattern has no capturing groups
val Pattern = """^(?:foo|bar)\.baz""".r

// no capturing groups, but you must still use empty parens
"""foo.baz""" match {
  case Pattern() => "ok" // MUST have parens at the end of Pattern
  case _ => "bleh"
}

Using modifiers (case-insensitive, unicode, etc)

You need to embed them into your expressions using the (?modifier) syntax: for example, (?i) for case-insensitive matching and (?u) for Unicode-aware case folding (used together with (?i)).

See all possible modifiers in the Java 7 Docs for Regular Expressions

// no match using case-sensitive pattern
val caseSensitivePattern = """foo\d+"""

"Foo123".matches(caseSensitivePattern)
// res0: Boolean = false

// the same string now matches if you use the case-insensitive modifier:
val caseInsensitivePattern = """(?i)foo\d+"""

"Foo123".matches(caseInsensitivePattern)
// res1: Boolean = true
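Embedded modifiers work the same way once the string is compiled with .r. A small sketch, with sample text chosen just for this example, that contrasts the two patterns when searching inside a longer string:

```scala
object ModifierExample {
  val caseSensitive = """foo\d+""".r
  val caseInsensitive = """(?i)foo\d+""".r

  def main(args: Array[String]): Unit = {
    val text = "see FOO42 and foo7"
    println(caseSensitive.findAllIn(text).toList)   // List(foo7)
    println(caseInsensitive.findAllIn(text).toList) // List(FOO42, foo7)
  }
}
```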


  • It's better to use """ (triple double quotes) as delimiters for your expressions because you don't need to escape backslashes: you can write """\d+""" instead of "\\d+".
  • In replacement strings, group(1), group(2) and so on can also be expressed as "$1", "$2" and so on. Use whichever you think is clearer.
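To illustrate the last point, here is a short sketch that swaps two captures, once with "$n" references in the replacement string and once with group(n) in a replacer function. The pattern and input are picked only for this example.

```scala
object GroupReferenceExample {
  val yearMonth = """(\d{4})-(\d{2})""".r

  def main(args: Array[String]): Unit = {
    val str = "2016-02"
    // Using "$n" references in the replacement string
    println(yearMonth.replaceAllIn(str, "$2/$1")) // 02/2016
    // The same result using group(n) in a replacer function
    println(yearMonth.replaceAllIn(str, m => s"${m.group(2)}/${m.group(1)}")) // 02/2016
  }
}
```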

