Lua Magic Characters

Introduction

Lua patterns give special meaning to a small set of characters, called magic characters. They let you express repetition, anchoring, captures, and character sets when pattern matching.

Task

Explain how to use magic characters for Lua pattern matching.

Implementation

The magic characters are:

(   )   .   %   +   -   *   ?   [   ^   $

In addition, Lua provides the following character classes (note that each one is introduced by the magic character %):

  • %a   letters
  • %c   control characters
  • %d   digits
  • %l   lowercase letters
  • %p   punctuation characters
  • %s   whitespace characters
  • %u   uppercase letters
  • %w   alphanumeric characters
  • %x   hexadecimal digits
  • %z   the character \000 (Lua 5.1; deprecated in Lua 5.2, where a pattern may contain \0 directly)
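
As a rough illustration of the classes above, the following sketch asks string.match for the first character of each kind in a sample string. The sample string and the variable names are made up for this example.

```lua
-- Sample input, chosen only for illustration.
local s = "Lua 5.1!"

-- Each class matches exactly one character of the named kind;
-- string.match returns the first match found in the string.
local first_letter = string.match(s, "%a")  -- "L"
local first_digit  = string.match(s, "%d")  -- "5"
local first_punct  = string.match(s, "%p")  -- "."
local first_space  = string.match(s, "%s")  -- " "
local first_lower  = string.match(s, "%l")  -- "u"
local first_upper  = string.match(s, "%u")  -- "L"
local first_hex    = string.match(s, "%x")  -- "a"

print(first_letter, first_digit, first_punct, first_lower, first_upper, first_hex)
```

Note that %x matches the "a" in "Lua" before it reaches the digit 5, because the letters a through f count as hexadecimal digits.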

How it works

This section explains what each of the magic characters does. It also explains how to work with sets of characters.

The magic characters:

  • “(   )”
    • Delimits what is called a capture. The text matched by the enclosed sub-pattern is returned separately by functions such as string.match
  • “.”
    • Represents any single character
    • If you want the literal character then you have to escape it with the % character: %.
  • “%”
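    • This is the escape character: followed by a letter it selects a character class (for example %d), and followed by a magic character it makes that character literal
    • To match a literal percent sign, write %% in the pattern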
  • “+”
    • Matches 1 or more repetitions of the class. This will always match the longest possible chain.
    • Example Usage: %w+
  • “-”
    • Matches 0 or more repetitions of the class. This will always match the shortest possible chain
    • Example Usage: %d-
  • “*”
    • Matches 0 or more repetitions of the class. This will always match the longest possible chain
    • Example Usage: %l*
  • “?”
    • Matches 0 or 1 occurrence of the class
    • Example Usage: %a?
  • “^”
    • This is only a magic character when it is at the beginning of a pattern.
    • When this is at the beginning of a pattern it forces the pattern to match the start of a string
    • Example Usage: ^A.+ will match a string that begins with the character A followed by at least one more character
  • “$”
    • This is only a magic character when it is at the end of a pattern.
    • When it is at the end of a pattern it forces the pattern to match the end of the string
    • Example Usage: %w%.$ will match an alphanumeric character followed by a literal period, and only when that period is the last character of the string
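
The behaviour described above can be sketched with the standard string.match and string.find functions. The input strings and variable names below are invented for illustration only.

```lua
-- Greedy vs. shortest repetition: "+" takes the longest run, "-" the shortest.
local text = "<b>bold</b>"
local greedy = string.match(text, "<.+>")   -- "<b>bold</b>"
local lazy   = string.match(text, "<.->")   -- "<b>"

-- "*" allows zero repetitions, so the letters after "ID" are optional here.
local id = string.match("ID42", "ID%l*%d+")  -- "ID42"

-- "?" makes the preceding character optional.
local us = string.match("color", "colou?r")   -- "color"
local uk = string.match("colour", "colou?r")  -- "colour"

-- Escaping: "%." is a literal period, "%%" a literal percent sign.
local ext     = string.match("notes.txt", "%.%a+")  -- ".txt"
local percent = string.match("50% off", "%d+%%")    -- "50%"

-- Captures: each pair of parentheses returns its own sub-match.
local key, value = string.match("width=100", "(%w+)=(%d+)")  -- "width", "100"

-- Anchors: "^" ties the match to the start, "$" to the end of the string.
local starts_with_A = string.find("Apple", "^A.+") ~= nil     -- true
local a_not_first   = string.find("an Apple", "^A.+") ~= nil  -- false
local ends_with_dot = string.find("The end.", "%w%.$") ~= nil -- true
local dot_inside    = string.find("e.g", "%w%.$") ~= nil      -- false

print(greedy, lazy, key, value)
```

The greedy and shortest variants differ only in the repetition character, which makes the pair a convenient way to see why “-” is usually the better choice when matching up to the first closing delimiter.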

Fun with Sets:

  • The “[” and “]” symbols are used to represent sets:
    • The “[” character denotes the start of a set, and a “]” shows the end
    • A set is a class which is the union of all of the characters and/or classes which appear in the set
    • Example Usage: [%d%l] will match any digit or any lowercase letter
    • Example Usage: [%dabc] will match any digit or the characters a, b, or c
  • Sets can be negated by placing the “^” character first inside the brackets:
    • This will make the set match any character except those listed inside the brackets
    • Example Usage: [^%l] will match anything but a lowercase letter
  • Use the “-” (dash) character between two characters to indicate a range of values:
    • Example Usage: [1-5] will match any single digit from 1 through 5 inclusive
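
A minimal sketch of the set examples above, using string.match on made-up input strings. A “+” is added after some sets so that the whole run of matching characters is returned.

```lua
-- Union of classes: any digit or any lowercase letter.
local digits_or_lower = string.match("ABC9xyz", "[%d%l]+")  -- "9xyz"

-- Class mixed with literal characters: any digit, or a, b, or c.
local digits_or_abc = string.match("xyz1b2c!", "[%dabc]+")  -- "1b2c"

-- Negated set: anything except a lowercase letter.
local not_lower = string.match("abcDEF", "[^%l]+")  -- "DEF"

-- Range: a single digit from 1 through 5.
local in_range = string.match("level 9, stage 3", "[1-5]")  -- "3"

print(digits_or_lower, digits_or_abc, not_lower, in_range)
```

In the last line the 9 is skipped because it falls outside the range 1 through 5, so the first match is the 3.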

More information