Scala Regular Expressions: Examples & Reference
Last updated:
- String matches regex
- String contains regex
- Get first regex match in String
- Iterate over regex matches
- Get matches as List
- Search and replace regex
- Search and replace regex with custom function
- Search and replace regex with captures
- Extract capture into variable
- Extract regexes with pattern matching
- Case-insensitive Regex
HEADS-UP
- It's better to use """ (triple double quotes) as delimiters for your expressions because you don't need to escape backslashes.
- In replacement strings, group(1), group(2) and so on can also be expressed as "$1", "$2" and so on. Use whichever you think is clearer.
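As a quick illustration of both points, here is a minimal sketch. The names (escaped, raw, datePat and so on) are made up for this example.

```scala
// The same pattern written with a regular string literal and with triple quotes
val escaped = "\\d+\\.\\d+".r   // every backslash must be doubled
val raw     = """\d+\.\d+""".r  // backslashes are written as-is

// both literals produce the same regular expression source
val sameSource = escaped.regex == raw.regex

// "$1" and "$2" in a replacement string refer to the same captures
// as m.group(1) and m.group(2) in a replacer function
val datePat   = """(\d{4})-(\d{2})""".r
val viaDollar = datePat.replaceAllIn("2016-02", "$2/$1")
val viaGroup  = datePat.replaceAllIn("2016-02", m => m.group(2) + "/" + m.group(1))
// both are "02/2016"
```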
String matches regex
String must fully match regex
"5".matches("""\d""")
// true
String contains regex
"""pattern""".r.findAllIn("string").length != 0
// findAllIn returns an iterator, so check that it is not empty
"""\d+""".r.findAllIn("foo456bar").length != 0
// true
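A possible alternative, shown here only as a sketch, is to ask for the first match and check whether it exists. This avoids walking through every match just to count them. The value names are illustrative.

```scala
val digits = """\d+""".r

// findFirstIn returns an Option, so isDefined tells whether any match exists
val hasDigits = digits.findFirstIn("foo456bar").isDefined  // true
val noDigits  = digits.findFirstIn("foobar").isDefined     // false

// the underlying java.util.regex.Matcher gives the same answer via find()
val viaMatcher = digits.pattern.matcher("foo456bar").find() // true
```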
Get first regex match in String
val numberPat = """\d+""".r
val str = "foo 123 bar 456"
println(numberPat.findFirstIn(str))
// prints Some(123)
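Because findFirstIn returns an Option, you will usually want to unwrap it. The sketch below shows two common ways; describe is a hypothetical helper written for this example.

```scala
val numberPat = """\d+""".r

// pattern match on the Option to handle both outcomes
def describe(s: String): String = numberPat.findFirstIn(s) match {
  case Some(n) => s"first number: $n"
  case None    => "no number found"
}

val found   = describe("foo 123 bar 456") // "first number: 123"
val missing = describe("foo bar")         // "no number found"

// or transform the match and supply a default value
val asInt = numberPat.findFirstIn("foo 123 bar 456").map(_.toInt).getOrElse(0) // 123
```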
Iterate over regex matches
val numberPat = """\d+""".r
val str = "foo 123 bar 456"
numberPat.findAllMatchIn(str).foreach { println _ }
// prints 123
// 456
Get matches as List
Just call toList:
val numberPat = """\d+""".r
val str = "foo 123 bar 456"
numberPat.findAllMatchIn(str).toList
// List(123, 456)
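Note that findAllMatchIn produces Match objects, while findAllIn produces plain strings. A small sketch of the difference (the value names are illustrative):

```scala
val numberPat = """\d+""".r
val str = "foo 123 bar 456"

// findAllIn yields the matched text directly
val asStrings: List[String] = numberPat.findAllIn(str).toList
// List("123", "456")

// findAllMatchIn yields Match objects, which also expose positions and groups
val withPositions: List[(String, Int)] =
  numberPat.findAllMatchIn(str).map(m => (m.matched, m.start)).toList
// List(("123", 4), ("456", 12))
```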
Search and replace regex
Use replaceAllIn to replace all occurrences of the regexp with the given string and replaceFirstIn to replace just the first match:
val lettersPat = """[a-zA-Z]+""".r
val str = "foo123bar"
lettersPat.replaceAllIn(str,"YAY")
// "YAY123YAY"
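For comparison, a short sketch of replaceFirstIn on the same input:

```scala
val lettersPat = """[a-zA-Z]+""".r
val str = "foo123bar"

// only the first run of letters is replaced
val firstOnly = lettersPat.replaceFirstIn(str, "YAY")
// "YAY123bar"
```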
Search and replace regex with custom function
You can also perform substitutions by passing a custom function of type scala.util.matching.Regex.Match => String as the second parameter to replaceAllIn:
val lettersPat = """[a-zA-Z]+""".r
val str = "foo123bar"
lettersPat.replaceAllIn(str, m => m.toString.toUpperCase)
// "FOO123BAR"
Search and replace regex with captures
If you need to put placeholders inside blocks of text, one strategy is to mark them with uncommon character sequences so that they are easy to find later.
For example, you can wrap identifiers in hashes (#) and process them afterwards.
N.B.: group(0) returns the full match, group(1) returns the first capture (within parentheses), group(2) returns the second capture and so on.
val pat = """##(\d+)##""".r
val str = "foo##123##bar"
// using a "replacer" function that replaces the number found with double its value
pat.replaceAllIn(str, m => (m.group(1).toInt * 2).toString)
// "foo246bar"
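When no computation is needed, the capture can be referenced directly in a replacement string. The sketch below also shows Regex.quoteReplacement, which is one way to emit a literal dollar sign; the value names are illustrative.

```scala
import scala.util.matching.Regex

val pat = """##(\d+)##""".r
val str = "foo##123##bar"

// "$1" inserts the first capture, here wrapped in square brackets
val bracketed = pat.replaceAllIn(str, "[$1]")
// "foo[123]bar"

// replacement text is scanned for $ and \, so a literal dollar sign must be quoted
val priced = pat.replaceAllIn(str, m => Regex.quoteReplacement("$" + m.group(1)))
// "foo$123bar"
```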
Extract capture into variable
// notice the .r method at the end
val pat = """(\d{4})-([0-9]{2})""".r
val myString = "2016-02"
val pat(year,month) = myString
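Extracting straight into variables throws a MatchError when the string does not fully match the pattern. If the input is not guaranteed to match, one option is to wrap the extraction in a function that returns an Option. This is only a sketch; parse is a hypothetical helper.

```scala
val pat = """(\d{4})-([0-9]{2})""".r

// returns the captures as numbers, or None when the string does not match
def parse(s: String): Option[(Int, Int)] = s match {
  case pat(year, month) => Some((year.toInt, month.toInt))
  case _                => None
}

val ok  = parse("2016-02") // Some((2016, 2))
val bad = parse("2016-2")  // None, the month needs two digits
```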
Extract regexes with pattern matching
You can also use pattern matching to test a string against multiple regular expressions:
// these are the strings we want to check
val dateNoDay = "2016-08"
val dateWithDay = "2016-08-20"
// these are the patterns (note the starting capital letter)
val YearAndMonth = """(\d{4})-([01][0-9])""".r
val YearMonthAndDay = """(\d{4})-([01][0-9])-([012][0-9])""".r
// this prints: "day provided: it is 20"
dateWithDay match{
case YearAndMonth(year,month) => println("no day provided")
case YearMonthAndDay(year,month,day) => println(s"day provided: it is $day")
}
As with regular case classes, you will get a MatchError if you exhaust your options without matching anything:
// this won't match any patterns
val badString = "foo-bar-baz"
// scala.MatchError: foo-bar-baz (of class java.lang.String)
badString match{
case YearAndMonth(year,month) => println("no day provided")
case YearMonthAndDay(year,month,day) => println(s"day provided: it is $day")
}
Even if there are no capturing groups you must use empty parens:
// groups starting with ?: are "non-capturing" groups
// so this pattern has no capturing groups
val Pattern = """^(?:foo|bar)\.baz""".r
// no matching groups but you must use empty parens
"""foo.baz""" match {
case Pattern() => "ok" // MUST have parens at the end of Pattern
case _ => "bleh"
}
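Pattern matching with a regex requires the whole string to match. If you want to match a pattern that appears anywhere inside a longer string, one option is the unanchored method of Regex. The names below are illustrative.

```scala
val Date = """(\d{4})-(\d{2})""".r
// unanchored is a method, so store the result in a val to use it in a case clause
val DateAnywhere = Date.unanchored

def anchored(s: String): String = s match {
  case Date(y, m) => s"$y/$m"
  case _          => "no match"
}

def anywhere(s: String): String = s match {
  case DateAnywhere(y, m) => s"$y/$m"
  case _                  => "no match"
}

val strict = anchored("released on 2016-08") // "no match"
val loose  = anywhere("released on 2016-08") // "2016/08"
```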
Case-insensitive Regex
Prepend (?i) to patterns to make them case-insensitive.
See all possible modifiers in the Java 7 Docs for Regular Expressions
// no match using case-sensitive pattern
val caseSensitivePattern = """foo\d+"""
"Foo123".matches(caseSensitivePattern)
// res0: Boolean = false
// the same pattern now matches if you use the case-insensitive modifier:
val caseInsensitivePattern = """(?i)foo\d+"""
"Foo123".matches(caseInsensitivePattern)
// res1: Boolean = true
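The modifier works the same way once the string is turned into a Regex with .r. A small sketch, with made-up names and input:

```scala
val fooPat = """(?i)foo\d+""".r

// all spellings of "foo" followed by digits are found
val hits = fooPat.findAllIn("Foo1 FOO22 foo333 bar4").toList
// List("Foo1", "FOO22", "foo333")

// the modifier also applies when the regex is used in pattern matching
val Tagged = """(?i)foo(\d+)""".r
val number = "FOO42" match {
  case Tagged(n) => n.toInt
  case _         => -1
}
// 42
```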