Patterns

From GMod Wiki

Revision as of 19:34, 13 October 2010 by Divran
Useful Information
Lua: Patterns
Description: Shows people how to use Lua patterns
Original Author: Brian Nevec
Created: 30th June, 2009

What's this article for?

This article teaches you how to use Lua's pattern matching language. The pattern matching language (or just "patterns" for short) gives you fairly advanced tools for searching and replacing recurring patterns in strings. These tools can be used for writing text data parsers, custom formatters and many other things that would otherwise take hundreds of lines of code.

A lot of the theory in this article is either copied or rewritten from the Lua reference manual; see the manual's section on patterns for the full details.

Getting started

A typical pattern looks like this: [%w_]+. That specific pattern could be used for finding variable names (such as "hi_there", "h0w_are_you", etc.). What each character in the pattern does is explained later in this article.

These functions can be used together with patterns: string.find, string.match, string.gmatch and string.gsub.

I will use all of these functions and explain in detail how each of them works.

Special characters

There are a number of special characters that either escape other characters or modify the pattern in some way. These characters are: "^$()%.[]*+-?". They can also be used in the pattern as normal characters by prefixing them with a "%" character, so "%%" matches "%", "%[" matches "[", etc.
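As a minimal sketch of escaping, the snippet below uses only the standard string library; the sample strings and variable names are made up for illustration.

```lua
local str = "Progress: 50% (almost there)";

-- "%%" matches a literal percent sign
local percent = string.match( str, "%d+%%" );
print( percent ); -- 50%

-- "%(" and "%)" match literal parentheses; unescaped they would start a capture
local note = string.match( str, "%(.-%)" );
print( note ); -- (almost there)

-- an unescaped "." matches any character, an escaped "%." only a real dot
local anyStart, anyEnd = string.find( "a+b", "a.b" );
local dotStart = string.find( "a+b", "a%.b" );
print( anyStart, anyEnd ); -- 1 3
print( dotStart ); -- nil
```

Escaping a character that is not special (for example "%:") is harmless, so when in doubt about a punctuation character it is safe to escape it.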

Character classes

Character classes represent a set of characters. They can be either predefined sets or custom sets that can consist of the same predefined sets, ranges or any single characters.

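Available character classes (custom and predefined):

x: any character that is not one of the special characters represents itself
.: any character
%a: letters
%c: control characters
%d: digits
%l: lower-case letters
%p: punctuation characters
%s: whitespace characters
%u: upper-case letters
%w: alphanumeric characters
%x: hexadecimal digits
%z: the character with representation 0
[set]: a custom set, the union of all characters, ranges (such as a-z) and classes listed in it
[^set]: the complement of the set

For the classes written as "%" and a letter, the upper-case letter represents the complement of the class; for example, %S matches any non-whitespace character.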
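The following sketch tries a few predefined classes and custom sets on one sample string. The string and the variable names are invented for this example; only standard string library functions are used.

```lua
local str = "Player_One joined at 19:34!";

-- %a matches letters, %d digits, %p punctuation
local letters = string.match( str, "%a+" );
local digits = string.match( str, "%d+" );
local punct = string.match( str, "%p" );
print( letters, digits, punct ); -- Player 19 _

-- an upper-case class is the complement: %S is "anything that is not whitespace"
local firstWord = string.match( str, "%S+" );
print( firstWord ); -- Player_One

-- a custom set can mix a class with single characters, or use a range
local time = string.match( str, "[%d:]+" );
local range = string.match( str, "[a-f]+" );
print( time, range ); -- 19:34 a

-- "^" as the first character of a set negates the set
local separator = string.match( str, "[^%w_]" );
print( separator == " " ); -- true
```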

Repetition and anchoring

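Characters in a string match a pattern in the following ways:

a single character class matches any single character in the class
a character class followed by "*" matches 0 or more repetitions, always taking the longest possible sequence
a character class followed by "+" matches 1 or more repetitions, always taking the longest possible sequence
a character class followed by "-" matches 0 or more repetitions, always taking the shortest possible sequence
a character class followed by "?" matches 0 or 1 occurrence

Patterns can be anchored like so:

"^" at the beginning of a pattern anchors the match to the beginning of the string
"$" at the end of a pattern anchors the match to the end of the string

These two characters only have a special meaning when positioned as stated above. At any other position they represent themselves (apart from "^" as the first character of a set, which negates the set).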
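To make the difference between the repetition characters and the effect of anchoring concrete, here is a small sketch. The sample strings and variable names are assumptions made for illustration; the code uses only the standard string library.

```lua
local str = "<b>bold</b> and <i>italic</i>";

-- "+" is greedy and takes the longest possible match
local greedy = string.match( str, "<.+>" );
print( greedy ); -- <b>bold</b> and <i>italic</i>

-- "-" takes the shortest possible match
local lazy = string.match( str, "<.->" );
print( lazy ); -- <b>

-- "?" makes the preceding character optional
local british = string.match( "colour", "colou?r" );
local american = string.match( "color", "colou?r" );
print( british, american ); -- colour color

-- "*" also accepts zero repetitions, "+" needs at least one
local zeroOrMore = string.match( "abc", "%d*" );
local oneOrMore = string.match( "abc", "%d+" );
print( zeroOrMore == "", oneOrMore ); -- true nil

-- "^" anchors the match to the start of the string, "$" to the end
local atStart = string.find( "hello world", "^world" );
local endFrom, endTo = string.find( "hello world", "world$" );
print( atStart, endFrom, endTo ); -- nil 7 11

-- anywhere else "$" is an ordinary character
local dollarFrom, dollarTo = string.find( "cost: $5", "$%d" );
print( dollarFrom, dollarTo ); -- 7 8
```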

Captures

Patterns can also contain sub-patterns enclosed in "()"; these are called captures. Captures are used in functions like string.match and string.gsub to return or substitute a specific part of the match. Examples of how to use them can be found below.
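As a quick preview, this sketch splits a date string into its parts with captures. The date format and the variable names are assumptions chosen for the example.

```lua
local str = "2010-10-13";
local pattern = "(%d+)%-(%d+)%-(%d+)"; -- "-" is a special character, so it is escaped

-- each pair of parentheses returns its own value, in order
local year, month, day = string.match( str, pattern );
print( year, month, day ); -- 2010 10 13

-- captures are strings; convert them if numbers are needed
local nextYear = tonumber( year ) + 1;
print( nextYear ); -- 2011

-- in a gsub replacement string, %1, %2, ... refer to the captures
local swapped = string.gsub( str, pattern, "%3.%2.%1" );
print( swapped ); -- 13.10.2010

-- an empty capture "()" returns the current position in the string instead of text
local position = string.match( "hello world", "()world" );
print( position ); -- 7
```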

Usage

Now I'm going to show you how to actually put all of the above to use. The examples below explain how to use the four functions listed above.

string.find

string.find( string str, string pattern, [number start, [boolean plain]] );

Str is the string to search, pattern is the pattern to find, start is the index to start searching from and plain is a boolean that, when true, turns pattern matching off so that pattern is searched for as plain text. The function returns the start and end indices (not start index and length) of the matching substring. If the pattern has captures, they are returned after the indices. If no match is found, the function returns nil.

The following code will find the first word in the string.

 
local str = "1. Don't spam!";
local pattern = "([%a']+)"; -- will match a substring of one or more letters or apostrophes( ' )
local start, endpos, word = string.find( str, pattern );
 
print( start, endpos, word );
 

Output: 4 8 Don't

You probably think that could be done with string.Explode and a few loops, but look, we did it in three lines.

The following code will check if the string is safe to be used as a file name.

 
local str = "cry|*to";
local pattern = '[\\/:%*%?"<>|]'; -- a set of all restricted characters
local start = string.find( str, pattern );
 
print( "String is "..( ( start ~= nil ) and "unsafe" or "safe" ) );
 

Output: String is unsafe
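The start and plain arguments of string.find are not used in the examples above, so here is a minimal sketch of both. The sample string and variable names are made up for illustration.

```lua
local str = "a.b.c";

-- as a pattern, "." matches any character, so the first hit is at position 1
local patStart, patEnd = string.find( str, "." );
print( patStart, patEnd ); -- 1 1

-- with plain set to true the dot is searched for literally
local firstStart, firstEnd = string.find( str, ".", 1, true );
print( firstStart, firstEnd ); -- 2 2

-- the start index lets the search continue after a previous match
local secondStart, secondEnd = string.find( str, ".", firstEnd + 1, true );
print( secondStart, secondEnd ); -- 4 4
```

Note that plain is the fourth argument, so start has to be passed as well (1 searches from the beginning).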

string.match

string.match( string str, string pattern, [number start] );

Str is the string to search, pattern is the pattern to find and start is the start position. If there is a match, the function returns the captures from the pattern; if there are no captures, it returns the whole match. If no match is found, the function returns nil.

The following code will parse a simple keyvalue line.

 
local str = "key=  value";
local pattern = "([%w_]+)%s*=%s*([%w_]+)"; -- will match a "variable name, 0 or more spaces, equal, 0 or more spaces, variable name"
local key, val = string.match( str, pattern );
 
print( key, val );
 

Output: key value

The following code will check if the string ends with a .lua extension.

 
local str = "teel.lua";
local pattern = ".+%.lua$"; -- anything until a dot and "lua" at the end of the string
local match = string.match( str, pattern );
 
print( "String ends with "..( ( match ) and ".lua" or "something else" ) );
 

Output: String ends with .lua

string.gmatch

string.gmatch( string str, string pattern );

Str is the string to search and pattern is the pattern to search for. The function returns an iterator function (a function used by for loops) that goes through every match in the string and returns the pattern's captures, if there are any, or the whole match if there are no captures. If nothing matches, the function does not return nil; it still returns an iterator, but the loop body never runs.

The following code goes through every word in the string.

 
local str = "This is PATTERNS!";
local pattern = "[%a']+";
 
for word in string.gmatch( str, pattern ) do
 
	print( word );
 
end
 

Output: This is PATTERNS

Any pattern that you use in string.match can also be used in gmatch, but, instead of finding only the first match, it will find every match in the string.

The following code uses the keyvalue parsing pattern but can now read a list of keyvalues.

 
local str = "key = value; key2 =  value2";
local pattern = "([%w_]+)%s*=%s*([%w_]+)"; -- same pattern as above
 
local tbl = { };
for key, value in string.gmatch( str, pattern ) do
 
	tbl[ key ] = value;
 
end
 
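for key, value in pairs( tbl ) do print( key, value ); end -- table.foreach is deprecated; the order of the entries is not guaranteed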
 

Output: key value key2 value2

The interesting thing is that the string can have any characters as separators between keyvalue pairs, because gmatch simply skips everything that does not match the pattern.

string.gsub

string.gsub( string str, string pattern, string/table/function repl );

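Str is the string to search in, pattern is the pattern to search for and repl is the value to replace with. The function returns str where all occurrences of pattern have been replaced with the value given by repl and, as the second return value, the total number of matches.

Repl can be the following things:

a string: the match is replaced with the string, in which %1, %2, ... stand for the captures and %0 stands for the whole match
a table: the first capture (or the whole match, if there are no captures) is used as the key and the value found in the table is the replacement
a function: the function is called with the captures (or the whole match, if there are no captures) and its return value is the replacement

If the function or table returns nil or false, the match is left as it is and nothing gets replaced.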
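The examples below only use a string and a function as repl, so here is a short sketch that also covers the table form and the second return value. The "$name" placeholder syntax and all names in it are assumptions made up for this example.

```lua
local str = "hello $name, you have $count new messages";
local pattern = "%$(%w+)"; -- a literal "$" followed by a captured word

-- table: the first capture is used as the key
local values = { name = "Alice", count = "3" };
local filled = string.gsub( str, pattern, values );
print( filled ); -- hello Alice, you have 3 new messages

-- function: called with the captures, its return value is the replacement
local shouted = string.gsub( str, pattern, function( key ) return string.upper( key ); end );
print( shouted ); -- hello NAME, you have COUNT new messages

-- a missing key (nil) leaves the match untouched, but it still counts as a match
local partial, matches = string.gsub( str, pattern, { name = "Alice" } );
print( partial ); -- hello Alice, you have $count new messages
print( matches ); -- 2
```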

The following code formats a keyvalue pair as an xml node.

 
local str = "key = value";
local pattern = "([%w_]+)%s*=%s*([%w_]+)";
local replacement = "<%1>%2</%1>";
 
local output = string.gsub( str, pattern, replacement );
 
print( output );
 

Output: <key>value</key>

The following code creates a function that works like the .NET String.Format placeholders ( {0}, {1}, etc. ).

 
function string.format2( fmt, ... )
 
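	-- collect the ... arguments in a table; the implicit 'arg' table is not available in every Lua build
	local args = { ... };
	-- the capture is a string, so convert it to a number; {0} maps to the first argument
	return ( fmt:gsub( "{(%d+)}", function( i ) return args[ tonumber( i ) + 1 ]; end ) );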
 
end
 
local str = "This is {0}, oh {1}..";
local repl1 = "PATTERNS";
local repl2 = "YEAH";
 
local output = string.format2( str, repl1, repl2 );
 
print( output );
 

Output: This is PATTERNS, oh YEAH..

Conclusion

The article is finally over! I hope you learned something new from all of this. Lua's patterns are very powerful when used right. When making an addon that relies heavily on strings, patterns will most likely come in handy. You can find more examples in either the Lua reference manual or Programming in Lua (PIL).

Good day!
