PHP Regular Expressions

Regular expressions (abbreviated as regex) are sequences of characters that form search patterns. They are mainly used for matching patterns in strings.

A brief history

  • It all began between the 1940s and the 1960s, when many smart people talked about regular expressions;
  • 1970s: g/re/p;
  • 1980s: Perl and Henry Spencer;
  • 1997: PCRE (Perl Compatible Regular Expressions). That's when what we call regular expressions took off. PCRE provides libraries for almost every language.

Common use of regular expressions in PHP

PHP includes three main functions for working with PCRE - preg_match, preg_match_all and preg_replace.

Matching

preg_match returns 1 if a match was found, 0 if not, and false if an error occurs:

int preg_match (string $pattern, string $subject [, array &$matches [, int $flags = 0 [, int $offset = 0]]])

preg_match_all returns the number of matches found:

int preg_match_all (string $pattern, string $subject [, array &$matches [, int $flags = PREG_PATTERN_ORDER [, int $offset = 0]]])

Replacement

preg_replace returns the resulting string or array (depending on the type of $subject):

mixed preg_replace (mixed $pattern, mixed $replacement, mixed $subject [, int $limit = -1 [, int &$count]])
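
A minimal sketch of how the three functions report their results. The sample string and the variable names are illustrative assumptions, not part of the original examples.

```php
<?php
$subject = 'Order 66 shipped in 3 days';

// preg_match stops at the first match and returns 1, 0, or false on error.
$found = preg_match('/\d+/', $subject, $m);

// preg_match_all keeps searching and returns the total number of matches.
$count = preg_match_all('/\d+/', $subject, $all);

// preg_replace returns the subject with every match replaced.
$masked = preg_replace('/\d+/', '#', $subject);

var_dump($found);   // int(1)
var_dump($m[0]);    // string "66"
var_dump($count);   // int(2)
var_dump($all[0]);  // array with "66" and "3"
var_dump($masked);  // string "Order # shipped in # days"
```

Comparing the return value with === 1 (rather than relying on truthiness) keeps the error case, false, distinct from "no match".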

Common use of regular expressions in JavaScript

Regular expressions in JavaScript look almost the same as they do in PHP.

Matching

Returns an array of matches, or null if no matches were found:

string.match(RegExp);

Replacement

Returns a new string with the replacements made:

string.replace(RegExp, replacement);

Features of regular expressions in JavaScript

  • By default, the period never matches a newline.
  • The same methods are used for matching and replacing with a regular expression as with a plain string.

Principles of Writing Regular Expression Patterns

Consider an example where you need to find email addresses in a code base. Our goal is to build a pattern that matches them.

The socket analogy

Regular expressions are composed of two types of characters:

  • Special characters: ? * + . [ ] { } ( ) ^ $ /
  • Literals.

Think of the input strings as bolts and the pattern as a set of sockets for them (in the appropriate order).

Special symbols

When writing regular expressions, you need to know how special characters work:

  • The backslash \ escapes another special character in a regular expression.
  • Dot . and \w.

The dot matches any character except a newline. To match a dot, and only a dot, use \. and for letters, digits and the underscore use \w.

  • Square brackets [].

Match any one of the characters within the brackets. Ranges are supported. Some examples:
o [abc] - matches any of a, b or c.
o [A-Z] - uppercase letters.
o [0-9] - any digit.
o [a-zA-Z] - matches any lowercase or uppercase letter.

  • Question mark ?.

Optional: matches 0 or 1 occurrence.

  • Asterisk *.

Matches 0 or more occurrences.

  • Plus +.

Matches 1 or more occurrences.

  • Curly braces {}.

Minimum and maximum number of occurrences. Some examples:
o {1,} - at least 1.
o {1,3} - from 1 to 3.
o {1,64} - from 1 to 64.
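
The special characters above can be tried out with a short script. This is only a sketch: the patterns and test strings are illustrative assumptions, and each pattern is anchored with ^ and $ so that the whole string has to match.

```php
<?php
// Each row: pattern, subject, expected preg_match result.
$checks = [
    ['/^[abc]$/',      'b',      1], // one character from the set
    ['/^[A-Z]+$/',     'PHP',    1], // one or more uppercase letters
    ['/^[0-9]{1,3}$/', '1234',   0], // four digits exceed the {1,3} limit
    ['/^colou?r$/',    'color',  1], // the "u" is optional
    ['/^ab*$/',        'a',      1], // zero "b" characters is allowed
    ['/^a\.b$/',       'a.b',    1], // an escaped dot matches only a dot
    ['/^a\.b$/',       'axb',    0],
    ['/^\w+$/',        'user_1', 1], // letters, digits and underscore
];

$results = [];
foreach ($checks as [$pattern, $subject, $expected]) {
    $result = preg_match($pattern, $subject);
    $results[] = $result;
    printf("%-18s %-8s => %d (expected %d)\n", $pattern, $subject, $result, $expected);
}
```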

Putting all of this together gives a regex for email addresses:

/[\w.+-]+@\w+(\.\w+)*/i

How it looks in PHP:

preg_match_all("/[\w.+-]+@\w+(\.\w+)*/i", $input_lines, $output_array);
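
A minimal sketch of extracting addresses from text with preg_match_all. The pattern is a simplified assumption (a local part, "@", then dot-separated labels), not a full validator, and the sample text and addresses are made up for illustration.

```php
<?php
$inputLines = 'Contact alice@example.com or Bob.Smith@mail.example.org for details.';

// Assumed simplified pattern: local part, "@", a label, then optional ".label" parts.
$pattern = '/[\w.+-]+@\w+(\.\w+)*/i';

preg_match_all($pattern, $inputLines, $outputArray);

// $outputArray[0] holds the full matches; $outputArray[1] holds the last ".label" group.
print_r($outputArray[0]);
```

Here $outputArray[0] contains the two addresses found in the text.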

Using a regular expression for validation

Task: ensure that the input is what we expect. Goal 1: /[^\w$.]/ Goal 2: /^[0-9]{1,2}$/

Regular expressions are good for finding items, but you need to know what exactly you are looking for.

When shouldn't you use a regular expression for validation?

Many cases are better handled with the PHP function filter_var. For example, email address validation should be done using PHP's built-in filters:

filter_var("user@example.com", FILTER_VALIDATE_EMAIL)

Validation with regular expressions

To validate an entire value, regular expressions use anchors:

^ - indicates the beginning of a line.
$ - indicates the end of a line.

if (!preg_match("%^[0-9]{1,2}$%", $_POST["subscription_frequency"])) { $isError = true; }

Negated character classes

[^abc] - everything except a, b or c, including newlines.

An example that restricts input to alphanumeric characters, dashes, periods and underscores:

if (preg_match("/[^0-9a-z\-_.]/i", $productCode)) { $isError = true; }
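
The same check can be wrapped in a small helper. This is a sketch: the function name isValidProductCode is hypothetical, and rejecting the empty string is an added assumption.

```php
<?php
// Hypothetical helper: true when the code contains only allowed characters.
function isValidProductCode(string $code): bool
{
    // The negated class matches any single character that is NOT allowed,
    // so a result of 0 means no forbidden character was found.
    return $code !== '' && preg_match('/[^0-9a-z\-_.]/i', $code) === 0;
}

var_dump(isValidProductCode('AB-12_x.9')); // bool(true)
var_dump(isValidProductCode('AB 12'));     // bool(false): contains a space
var_dump(isValidProductCode('price$'));    // bool(false): contains "$"
```

Searching for one forbidden character is often simpler than describing every valid string, which is why the negated class is used here.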

Find and replace

The most common PCRE functions for performing find and replace are preg_replace() and preg_replace_callback(). There are also preg_filter() and preg_replace_callback_array(), which do much the same thing. Please note that preg_replace_callback_array() is available since PHP 7.

Replacing words from a list

$subject = "I want to eat some apples.";
echo preg_replace("/apple|banana|orange/", "fruit", $subject);

Result

I want to eat some fruits.

If the regular expression contains subpatterns (in parentheses), you can refer to them in the replacement string as $N or \N (where N is an integer >= 1); this is called a backreference.

Swapping two numbers

$subject = "7/11";
echo preg_replace('/(\d+)\/(\d+)/', '$2/$1', $subject);

Result

11/7

Changing the date format

$subject = "2001-09-11";
echo preg_replace('/(\d+)-(\d+)-(\d+)/', '$3/$2/$1', $subject);

Result

11/09/2001

A simple example of wrapping a URL in a link tag

$subject = "Please visit https://php.earth/doc for more articles.";
echo preg_replace('#(https?://([^\s./]+(?:\.[^\s./]+)*[^\s]*))#i', '<a href="$1">$2</a>', $subject);

Result

Please visit <a href="https://php.earth/doc">php.earth/doc</a> for more articles.

Sometimes a more complex search and replace is needed, for example when the match has to be filtered or checked before it is replaced. preg_replace_callback() comes in handy in this situation.

The regex in the previous example can only replace URLs starting with http or https. Now we also need to replace URLs starting with www. One might think it is enough to change https?:// in the subpattern to (?:https?://|www\.), but the resulting links won't work in most browsers, because they interpret www.domain as a relative path.

Therefore, in the callback, before replacing, you need to prepend http:// if the URL starts with www.

function add_protocol_if_begins_with_www($matches) {
    $url = strtolower($matches[1]) === "www." ? "http://" . $matches[0] : $matches[0];
    return "<a href=\"{$url}\">{$matches[0]}</a>";
}
$subject = "Please visit www.php.earth/doc for more articles.";
echo preg_replace_callback('#(https?://|www\.)([^\s./]+(?>\.[^\s./]+)*[^\s]*)#i', "add_protocol_if_begins_with_www", $subject);

Result

Please visit <a href="http://www.php.earth/doc">www.php.earth/doc</a> for more articles.

Regular expressions are a very powerful tool for manipulating text. However, they are also quite difficult to learn and use.

There are several dialects of regular expressions; one of the most common and most developed is the syntax of Perl-compatible regular expressions (PCRE - Perl Compatible Regular Expressions).

In simple terms, a regular expression is a pattern that is applied to the given text from left to right. Ordinary characters keep their meaning in the pattern and match themselves. For example, a regular expression containing the text "comp" matches any string that contains this substring, for example "computer".

The boundaries of a regular expression are written like this:
"/comp/"
The forward slash (/) at the beginning and end of the character set serves as the delimiter of the regular expression: the expression extends up to the second forward slash.

Pattern modifiers that affect the entire regular expression may follow the closing delimiter. For example, the modifier "i" makes the search case-insensitive. For Russian characters in UTF-8 encoding, the modifier "u" (PCRE_UTF8) must be added for correct processing. For example:
"/comp/ui"
This regular expression matches both "computer" and "COMPUTER".

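A short sketch of what the two modifiers change. The sample strings are illustrative assumptions; the Cyrillic letter is written as a Unicode escape so the script does not depend on the encoding of the source file.

```php
<?php
// Without "i" the match is case-sensitive.
$caseSensitive   = preg_match('/comp/', 'COMPUTER');   // 0
$caseInsensitive = preg_match('/comp/i', 'COMPUTER');  // 1

// "\u{044F}" is the Cyrillic letter "ya", which takes two bytes in UTF-8.
$letter = "\u{044F}";

// Without "u" the dot matches a single byte, so one dot cannot cover the letter.
$byteMode = preg_match('/^.$/', $letter);   // 0
// With "u" the dot matches a whole UTF-8 character.
$utf8Mode = preg_match('/^.$/u', $letter);  // 1

var_dump($caseSensitive, $caseInsensitive, $byteMode, $utf8Mode);
```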
To anchor a regular expression to the beginning of a string, use "^" (the caret):
"/^light/"
This expression matches the string "lighting" and does not match the word "daylight".

The dollar sign "$" means the end of the string:
"/^light$/"
This regular expression matches only the string "light", with no other text before or after the search word.

The following regular expression matches an empty string:
"/^$/"
Quite often the search string itself contains the delimiter character, in our case the forward slash "/". In this case, you must escape it with a backslash (\):
"/^lamp\/ceiling$/"
In this example, the regular expression matches the string "lamp/ceiling".

Any other non-alphanumeric character can be used as a delimiter, for example "|":
"|^lamp/ceiling$|ui"
It makes sense to change the delimiter depending on the task: for example, if the forward slash "/" occurs often in the search string, pick a different delimiter.

You should be careful when choosing delimiter characters, because some of them have their own role in the pattern. The pipe character "|" is used in a regular expression to specify alternatives:
"/^abc|def$/"
This regular expression matches any string that begins with "abc" or ends with "def". The vertical bar is most commonly used when checking, for example, file extensions or domain name zones.

Parts of a regular expression can be grouped using parentheses "()":
"/^color (red|blue|green)$/"
This regex matches the string "color red", where "red" can also be "blue" or "green".

To use parentheses as part of the search string, they must be escaped. For example, the string "color (red)" is matched by the following regular expression:
"/^color \(red\)$/"
In addition to grouping, parentheses have one more purpose: everything matched inside parentheses is stored by the interpreter and can be accessed by the group number when replacing or searching.

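A brief sketch of how anchors, alternation and grouping interact. The test strings are illustrative assumptions. Note that without parentheses the anchors belong to the individual alternatives, not to the whole pattern.

```php
<?php
// "^abc|def$" means: starts with "abc" OR ends with "def".
$startsWithAbc = preg_match('/^abc|def$/', 'abcxyz'); // 1
$endsWithDef   = preg_match('/^abc|def$/', 'xyzdef'); // 1
$neither       = preg_match('/^abc|def$/', 'xabcx');  // 0

// With a group, both anchors apply to every alternative.
$groupedLoose = preg_match('/^(abc|def)$/', 'abcxyz'); // 0
$groupedExact = preg_match('/^(abc|def)$/', 'def');    // 1

// Parentheses also capture: $m[1] holds what the group matched.
preg_match('/^color (red|blue|green)$/', 'color blue', $m);

var_dump($startsWithAbc, $endsWithDef, $neither, $groupedLoose, $groupedExact, $m[1]);
```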
To specify a character class, use square brackets "[]". They limit the search to the characters enclosed in them:
"/[abc]/"
This regular expression matches a string that contains at least one of the characters "a", "b" or "c".

To create a regular expression that matches all the letters of the English alphabet, you can list all the letters in the class, or you can write it more briefly like this:
"/[a-z]/i"
Any two characters separated by a hyphen denote the range of characters between them. This class describes lowercase letters, but the modifier "i" makes the search case-insensitive.

Regular expressions matching a digit are written in a similar way:
"/[0-9]/"
When preceded by a backslash, some characters have a special meaning:

\d - any decimal digit ([0-9]);

\D - any character other than a decimal digit;

\s - any whitespace character ([ \r\n\t\f]);

\S - any non-whitespace character;

\w - any character that forms a "word" ([a-zA-Z0-9_]);

\W - any character that does not form a "word";

\t - tab character;

\n - line feed character;

\\ - backslash character (\);

\. - dot character (.).

The dot "." in a regular expression denotes any character other than the line break characters "\r" or "\n", so you must escape it to search for a literal dot.

The regular expression for a digit can be written like this:
"/[\d]/"
To exclude a character class from the search, put "^" as the first character inside the square brackets. There it no longer acts as a line anchor but as a negation:
"/[^0-9]/"
This regular expression matches any character that is not in the range "0-9".

List of special characters (metacharacters):
\ ^ $ . | ? * + ( ) [ ] { }
An expression in square brackets is often used together with the so-called quantifiers, the symbols "?", "+" and "*". A quantifier immediately follows a character (or class) and sets how many times it may occur in the string:

? - the character occurs once or not at all;

* - any number of occurrences of the character, including 0;

+ - one or more occurrences of the character.

For example, to find a substring containing one or more digits, use an expression like:
"/[\d]+/"
The symbol "*" means any number of occurrences, so the following regular expression matches either an empty string or a string consisting only of digits, of any length:
"/^[\d]*$/"
Regular expressions also use curly braces ({}), which indicate an exact number or a range of repetitions of an element:

"ab{2}" - matches the string "abb";

"ab{2,}" - "a" followed by at least two "b";

"ab{2,4}" - "a" followed by 2 to 4 characters "b".

The expression "{0,}" is equivalent to "*", and "{1,}" to "+". The expression "{0,1}" can be written more concisely as "?".

To repeat a sequence of characters, place it in parentheses. For example, the following regular expression matches "a" followed by 2 to 4 sequences "bc":
"/a(bc){2,4}/"
By default quantifiers are greedy: they match as much as possible. The modifier U inverts greediness. For example, the expression <.*> applied to a string with several HTML tags matches all of them at once, from the first "<" to the last ">". To match individual tags, use the lazy form <.*?> or the modifier: /<.*>/U.

The greediness of quantifiers can be a significant problem. For example, it is often expected that the expression <.*> will find HTML tags in the text. However, if the text contains more than one HTML tag, this expression matches the entire stretch of the string containing all of them.
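
The difference between greedy and lazy matching can be seen in a short sketch; the HTML fragment is an illustrative assumption.

```php
<?php
$html = '<b>bold</b> and <i>italic</i>';

// Greedy: .* runs to the end and backtracks to the LAST ">".
preg_match('/<.*>/', $html, $greedy);

// Lazy: .*? stops at the FIRST ">".
preg_match('/<.*?>/', $html, $lazy);

// The U modifier flips the default, so .* behaves lazily here.
preg_match('/<.*>/U', $html, $ungreedy);

// Collecting every individual tag with the lazy form.
preg_match_all('/<.*?>/', $html, $tags);

var_dump($greedy[0]);   // the whole string
var_dump($lazy[0]);     // "<b>"
var_dump($ungreedy[0]); // "<b>"
print_r($tags[0]);      // "<b>", "</b>", "<i>", "</i>"
```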

Functions for working with regular expressions

After the theoretical foundations, it's time to move on to practice. There are several functions for working with regular expressions.

First, consider the preg_match function, which searches a string using a regular expression and has the following syntax:
int preg_match (string $pattern, string $subject [, array &$matches [, int $flags = 0 [, int $offset = 0]]])
The preg_match function searches the given text $subject for a match with the pattern $pattern. If the optional parameter $matches is given, the search results are placed in that array. The element $matches[0] contains the part of the string that matches the entire pattern, $matches[1] the part matching the first parenthesized group, $matches[2] the second, and so on.

The optional parameter $flags can take the value PREG_OFFSET_CAPTURE. When it is specified, the format of the returned array $matches changes: each occurrence is returned as an array whose element 0 contains the found substring and whose element 1 contains its offset. The search runs from left to right, from the beginning of the string.

The preg_match function returns the number of matches found, which can only be 0 (no match found) or 1, because the function stops after the first match. If an error occurs, it returns false.
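
A small sketch of the $matches array with and without PREG_OFFSET_CAPTURE; the subject string is an illustrative assumption.

```php
<?php
$subject = 'Pages 10-25';

// Default: $m[0] is the whole match, $m[1] and $m[2] are the groups.
preg_match('/(\d+)-(\d+)/', $subject, $m);

// With PREG_OFFSET_CAPTURE every entry becomes [substring, byte offset].
preg_match('/(\d+)-(\d+)/', $subject, $mo, PREG_OFFSET_CAPTURE);

print_r($m);  // "10-25", "10", "25"
print_r($mo); // ["10-25", 6], ["10", 6], ["25", 9]
```

The offsets are counted in bytes from the start of the subject, which matters for multibyte text.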

To find all matches, use the preg_match_all function, which has the following syntax:
int preg_match_all (string $pattern, string $subject [, array &$matches [, int $flags = PREG_PATTERN_ORDER [, int $offset = 0]]])
The preg_match_all function searches the string $subject for all matches with the pattern $pattern and puts the result into the array $matches in the order determined by the $flags. As in the previous function, you can set the offset $offset from which the search in $subject starts. After the first match is found, each subsequent search continues not from the beginning of the string but from the end of the last found occurrence.
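
The effect of the ordering flags can be sketched as follows; the sample text is an illustrative assumption. PREG_PATTERN_ORDER groups results by capture group, while PREG_SET_ORDER groups them by match.

```php
<?php
$text = 'x=1, y=22';

// Default order: one sub-array per group (index 0 is the full match).
preg_match_all('/(\w)=(\d+)/', $text, $byPattern, PREG_PATTERN_ORDER);

// Set order: one sub-array per match, each holding full match and groups.
preg_match_all('/(\w)=(\d+)/', $text, $bySet, PREG_SET_ORDER);

print_r($byPattern); // [["x=1","y=22"], ["x","y"], ["1","22"]]
print_r($bySet);     // [["x=1","x","1"], ["y=22","y","22"]]
```

PREG_SET_ORDER is usually more convenient when each match is processed as one record in a foreach loop.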

Let's move on to the function that, in addition to searching, also performs replacement by regular expression - preg_replace:
mixed preg_replace (mixed $pattern, mixed $replacement, mixed $subject [, int $limit = -1 [, int &$count]])
The preg_replace function searches the string $subject for matches with the pattern $pattern and replaces them with $replacement.

The preg_split function splits a string by a regular expression:
array preg_split (string $pattern, string $subject [, int $limit = -1 [, int $flags = 0]])
The function returns an array of substrings of the given string $subject, split at the places that match the pattern $pattern.
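
A minimal sketch of preg_split; the input string and the choice of separators are illustrative assumptions.

```php
<?php
$list = 'red, green  blue,yellow';

// Split on any run of whitespace and/or commas; drop empty pieces.
$parts = preg_split('/[\s,]+/', $list, -1, PREG_SPLIT_NO_EMPTY);

// A positive limit caps the number of pieces; the rest stays in the last one.
$firstTwo = preg_split('/[\s,]+/', $list, 2);

print_r($parts);    // "red", "green", "blue", "yellow"
print_r($firstTwo); // "red", "green  blue,yellow"
```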

In most cases, the use of the above functions is sufficient for solving many problems.

There are also additional pattern constructs:

(?#comment) - a comment in the body of the pattern. Sometimes it is very useful to place a comment inside the regexp to make it easier to understand how it works.

(?:pattern) - grouping like "()", but without creating a backreference. This is useful when you need to group part of a pattern but do not need to capture it.

(?=pattern) - lookahead. This construct is needed to match something only when it is followed by a given expression. For example, the expression "/\w+(?=\t)/" matches a word followed by a tab, but the "\t" is not included in the result.
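
A short sketch of the lookahead and the non-capturing group; the sample strings are illustrative assumptions.

```php
<?php
// Lookahead: the tab must follow the word but is not part of the match.
preg_match('/\w+(?=\t)/', "name\tvalue", $look);

// Non-capturing group: (?:Mr|Ms) groups the alternatives without taking
// a group number, so the surname is group 1.
preg_match('/(?:Mr|Ms)\. (\w+)/', 'Ms. Lovelace', $title);

var_dump($look[0]); // "name"
print_r($title);    // "Ms. Lovelace", "Lovelace"
```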

Now let's look at the most frequently used examples of regular expressions.

Checking that an email address is entered correctly:
preg_match("/^[a-z0-9_.-]+@[a-z0-9_.-]+\.[a-z]{1,6}$/ui", $email)
Before the @ character, the pattern looks for letters and digits, a dash, an underscore and dots, one or more occurrences, starting from the beginning of the string:
^[a-z0-9_.-]+
This is followed by the second part of the address, starting with @ and using the same character set as the first part:
@[a-z0-9_.-]+
After that, we check the domain zone, which consists exclusively of letters, with a limited number of characters, up to the end of the string:
\.[a-z]{1,6}$
We can also select all email addresses from a text:
$text = "Here is the text and mailing address user@example.com and also one more address admin@example.org";
preg_match_all("/[a-z0-9_.-]+@[a-z0-9_.-]+\.[a-z]{1,6}/ui", $text, $matches, PREG_PATTERN_ORDER);
foreach ($matches[0] as $key => $val) {
    $email = filter_var($val, FILTER_VALIDATE_EMAIL);
    if ($email) $output[] = $email;
}
Unlike the validation pattern, for selection we removed the start (^) and end ($) anchors. The result of this example:
Array
(
    [0] => user@example.com
    [1] => admin@example.org
)
Checking that a name is entered correctly:
preg_match("#^[а-яґїієa-z\-_\".\d\s]+$#ui", $name);
Checking that a number is entered:
preg_match("/([0-9]+)/ui", $id)
Checking that a date is entered correctly:
$date = "2017.05.25";
preg_match("/^[0-9]{4}.[0-9]{2}.[0-9]{2}$/ui", $date);
Removing all style attributes:
preg_replace("/style=\"[^\"]*\"/", "", $string);
Since styles can be located inside almost any tag, the example removes only the style definition itself and leaves the tag in place.

Removing all inline span elements from the document:
preg_replace("#<span[^>]*?>#is", "", $string);
preg_replace("#<\/span>#is", "", $string);
Similarly, you can remove any tag, for example the h1 heading:
preg_replace("#<h1[^>]*?>#is", "", $table);
preg_replace("#<\/h1>#is", "", $table);
You can clean up tables using PHP regular expressions like this:
// Remove all attributes from table:
$table = preg_replace("#<table.*>#siU", "<table>", $table);
// Remove all attributes from tr:
$table = preg_replace("#<tr.*>#siU", "<tr>", $table);
// Remove all attributes from td (except colspan or rowspan):
$table = preg_replace("#<td[^>]+((colspan|rowspan)=[^\s>]+?)(|.*)>#siU", "<td $1>", $table);
You can check that a file name is correct using the following regular expression:
preg_match("/(^[a-zA-Z0-9]+([a-zA-Z\_0-9\.-]*))$/", $filename)
Cutting all images out of a text:
preg_replace("/<img[^>]*>/i", "", $content)
Finding all links:
preg_match_all("#<a[^>]*href=\"(.*)\"[^>]*>#Ui", $content, $url);
Imagine a situation where the user does not type a space after a period or comma. The result is one very long word, which does not always fit into the required field and causes horizontal scrolling. To avoid this, you can use the following regular expression, which adds a space after a period or comma:
preg_replace("/(\.|\,)([^\s])/ui", "$1 $2", $content)
Finding all hashtags (#tag) can be done like this:
preg_match_all("/\#(\w+[^\s]*)/ui", $text, $matches, PREG_PATTERN_ORDER);
Or you can list the characters allowed in a hashtag, and their number, explicitly, for example:
preg_match_all("/\#([a-z0-9_]{1,50})/ui", $text, $matches, PREG_PATTERN_ORDER);

In today's article, we will look at regular expressions in PHP, as well as see practical examples of using regular expressions in PHP scripts.

PHP Regular Expression Basics

Regular expressions were originally created to help with string processing on Unix systems. Later they came to be widely used on other systems and in many programming languages.

In PHP, regular expressions are used to parse text according to a specific pattern. Using regular expressions, you can easily match a pattern to the text you want in a string and replace it if needed, or just check for the presence of such text.

Regular expression types

There are 2 types of regular expressions:

  • Perl compatible
  • POSIX extended

The Perl-compatible functions are preg_match, preg_replace and others; the POSIX versions are ereg and eregi. Please note that the latter functions were deprecated in PHP 5.3.0 and removed in PHP 7.0.0. Therefore, we will only use the Perl-compatible functions. It is important to know that a Perl-compatible regular expression must be enclosed in delimiters, such as forward slashes (/).

Basic regular expression syntax in PHP

To use regular expressions, you first need to learn the pattern syntax. We can group characters within a template like this:

  • Regular characters that follow one after the other, for example hello
  • Start and end indicators such as ^ and $
  • Counting indicators such as +, *, ?
  • Logical operators such as |
  • Grouping operators such as (), [], {}

An example of a regular expression pattern for validating an email address looks like this:

^[a-zA-Z0-9._-]+@[a-zA-Z0-9-]+\.[a-zA-Z.]{2,5}$

PHP code for validating email using Perl-compatible regular expression looks like this:
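
A minimal sketch of such a check, assuming the pattern ^[a-zA-Z0-9._-]+@[a-zA-Z0-9-]+\.[a-zA-Z.]{2,5}$ that is built step by step later in this article; the address is a made-up sample.

```php
<?php
// Assumed pattern; the "/" delimiters are required by the preg_* functions.
$pattern = '/^[a-zA-Z0-9._-]+@[a-zA-Z0-9-]+\.[a-zA-Z.]{2,5}$/';
$email = 'john.doe@example.com';

// preg_match returns 1 when the whole address fits the pattern.
$isValid = preg_match($pattern, $email) === 1;

echo $isValid ? "Email address is valid.\n" : "Email address is not valid.\n";
```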

Now let's take a look at a detailed breakdown of the pattern syntax for a regular expression:

Pattern         Matches                  Does not match   Comment
world           Hello world              Hello Ivan       Passes if the pattern is present anywhere in the string
^world          world class              Hello world      Passes if the pattern is present at the beginning of the string
world$          Hello world              world class      Passes if the pattern is present at the end of the string
world/i         This WoRLd               Hello Ivan       Searches in case-insensitive mode
^world$         world                    Hello world      The string contains only "world"
world*          worl, world, worlddd     wor              There are 0 or more "d" after "worl"
world+          world, worlddd           worl             There is at least one "d" after "worl"
world?          worl, world, worly       wor, wory        There is 0 or 1 "d" after "worl"
world{1}        world                    worly            There is one "d" after "worl"
world{1,}       world, worlddd           worly            There are one or more "d" after "worl"
world{2,3}      worldd, worlddd          world            There are 2 or 3 "d" after "worl"
wo(rld)*        wo, world, worldold      wa               There are 0 or more "rld" after "wo"
earth|world     earth, world             sun              The string contains "earth" or "world"
w.rld           world, wwrld             wrld             Any character in place of the dot
^.{5}$          world, earth             sun              The string contains exactly 5 characters
[abc]           abc, bbaccc              sun              The string contains "a" or "b" or "c"
[a-z]           world                    WORLD            The string contains lowercase letters
[a-zA-Z]        world, WORLD, Worl12     123              The string contains lowercase or uppercase letters
[^wW]           earth                    w, W             The character cannot be "w" or "W"

Now let's move on to a more complex regular expression with a detailed explanation.

Practical examples of complex regular expressions

Now that you know the theory and basic syntax of regular expressions in PHP, it's time to create and analyze some more complex examples.

1) Checking a username with a regex
Let's start by checking the username. If you have a registration form, you will need to check that usernames are valid. Suppose you don't want the name to contain any special characters other than "_.-", and of course the name must contain letters and possibly digits. In addition, you may need to control the length of the username, for example 4 to 20 characters.

First, we need to define the allowed characters. This can be done with the following character class:

[a-zA-Z0-9_.-]

After that, we need to limit the number of characters with the following quantifier:

{4,20}

Now let's put this regex together:

^[a-zA-Z0-9_.-]{4,20}$

In the case of a Perl-compatible regular expression, enclose it with '/'. The resulting PHP code looks like this:
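
A sketch of the resulting check, assuming the pattern ^[a-zA-Z0-9_.-]{4,20}$ described above. The function name isValidUsername and the sample names are hypothetical.

```php
<?php
// Hypothetical helper wrapping the username pattern in "/" delimiters.
function isValidUsername(string $username): bool
{
    // 4 to 20 characters: letters, digits, underscore, dot or dash.
    return preg_match('/^[a-zA-Z0-9_.-]{4,20}$/', $username) === 1;
}

var_dump(isValidUsername('john_doe-99')); // bool(true)
var_dump(isValidUsername('abc'));         // bool(false): too short
var_dump(isValidUsername('bad name!'));   // bool(false): space and "!"
```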

2) Checking a hexadecimal color code with a regular expression
A hexadecimal color code looks like this: #5A332C. A short form is also acceptable, for example #C5F. In both cases, the color code starts with # followed by exactly 3 or 6 digits or letters from a to f.

So, we check the beginning of the code:

^#

Then we check the range of valid characters:

[a-fA-F0-9]

After that, we check the allowed code length (it can be either 3 or 6). The complete regex comes out as follows:

^#(([a-fA-F0-9]{3}$)|([a-fA-F0-9]{6}$))

Here we use the logical operator | to first check for a code like #123 and then for a code like #123456. The final PHP code for the regular expression validation looks like this:
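
A sketch of the check in PHP, assuming the pattern ^#(([a-fA-F0-9]{3}$)|([a-fA-F0-9]{6}$)); the function name isHexColor is hypothetical.

```php
<?php
// Hypothetical helper: accepts "#" followed by exactly 3 or 6 hex digits.
function isHexColor(string $color): bool
{
    return preg_match('/^#(([a-fA-F0-9]{3}$)|([a-fA-F0-9]{6}$))/', $color) === 1;
}

var_dump(isHexColor('#5A332C')); // bool(true)
var_dump(isHexColor('#C5F'));    // bool(true)
var_dump(isHexColor('#12345'));  // bool(false): 5 digits
var_dump(isHexColor('C5F'));     // bool(false): missing "#"
var_dump(isHexColor('#GGG'));    // bool(false): "G" is not a hex digit
```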

3) Validating a customer's email with a regular expression
Now let's see how we can validate an email address using regular expressions. First, take a close look at the following examples of email addresses:

user@example.com first.last@example.org user_name@example.info

As we can see, the @ symbol is a required element of the email address. In addition, there must be some set of characters before and after it. More precisely, it must be followed by a valid domain name.

Thus, the first part must be a string of letters, digits or some special characters such as _ - and the dot. In the pattern, we can write it like this:

^[a-zA-Z0-9._-]+

A domain name always has a name and a TLD (top-level domain), that is, the domain zone. The domain zone is .com, .ua, .info and the like. This means that the regex pattern for the domain looks like this:

[a-zA-Z0-9-]+\.[a-zA-Z.]{2,5}$

Now, if we put everything together, we get a complete regex pattern for validating an email address:

^[a-zA-Z0-9._-]+@[a-zA-Z0-9-]+\.[a-zA-Z.]{2,5}$

In PHP code, this check will look like this:
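
A sketch of the check as a reusable function, assuming the pattern ^[a-zA-Z0-9._-]+@[a-zA-Z0-9-]+\.[a-zA-Z.]{2,5}$. The function name validateEmailByPattern and the sample addresses are hypothetical; the last sample shows a limitation of the {2,5} length rule.

```php
<?php
// Hypothetical helper built on the pattern assembled above.
function validateEmailByPattern(string $email): bool
{
    $pattern = '/^[a-zA-Z0-9._-]+@[a-zA-Z0-9-]+\.[a-zA-Z.]{2,5}$/';
    return preg_match($pattern, $email) === 1;
}

$samples = [
    'user@example.com',
    'first.last@mail-server.info',
    'no-at-sign.example.com', // no "@"
    'user@example',           // no domain zone
    'user@example.museum',    // zone longer than 5 letters is rejected
];

foreach ($samples as $sample) {
    printf("%-30s %s\n", $sample, validateEmailByPattern($sample) ? 'valid' : 'invalid');
}
```

Because such a pattern rejects some real addresses, filter_var with FILTER_VALIDATE_EMAIL is usually the safer choice for production validation.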

We hope that today's article helped you get started with regular expressions in PHP, and that practical examples will come in handy when using regular expressions in your own PHP scripts.

This article provides a selection of PHP regexp examples: a handy and useful collection of regular expressions. All of the regex examples are valid in PHP. Enjoy!

Domain name verification example

This PHP snippet checks whether the string is a valid URL with a domain name.

if (preg_match('/^(http|https|ftp):\/\/([A-Z0-9][A-Z0-9_-]*(?:\.[A-Z0-9][A-Z0-9_-]*)+):?(\d+)?\/?/i', $url)) {
    echo "Your url is ok.";
} else {
    echo "Wrong url.";
}

Example of word highlighting in text

A very useful regular expression for finding and highlighting a given word in a text. The code is especially useful when generating search results output.

$text = "Sample sentence from KomunitasWeb, regex has become popular in web programming. Now we learn regex. According to wikipedia, Regular expressions (abbreviated as regex or regexp, with plural forms regexes, regexps, or regexen) are written in a formal language that can be interpreted by a regular expression processor";
$text = preg_replace("/\b(regex)\b/i", '<span style="background:#5fc9f6">\1</span>', $text);
echo $text;

An example of how to highlight search results for WordPress

Open the search.php file and find the the_title() function. Replace it with the following line:

echo $title;

Now, before the replaced line, paste this code:

<?php
$title = get_the_title();
$keys = explode(" ", $s);
$title = preg_replace('/(' . implode('|', $keys) . ')/iu', '<strong class="search-excerpt">\0</strong>', $title);
?>

Save your search.php file and open style.css. Add the following line to it:

strong.search-excerpt { background: yellow; }

An example of getting images from HTML with a regexp

This piece of PHP code uses regular expressions to find all images and their URLs.

$images = array();
preg_match_all('/(img|src)\=(\"|\')[^\"\'\>]+/i', $data, $media);
unset($data);
$data = preg_replace('/(img|src)(\"|\'|\=\"|\=\')(.*)/i', "$3", $media[0]);
foreach ($data as $url) {
    $info = pathinfo($url);
    if (isset($info["extension"])) {
        if (($info["extension"] == "jpg") || ($info["extension"] == "jpeg") || ($info["extension"] == "gif") || ($info["extension"] == "png"))
            array_push($images, $url);
    }
}

Remove duplicate words (case insensitive)

Does your text often contain repeated words? Then this regular expression will be useful to you.

$text = preg_replace("/\s(\w+\s)\\1/i", "$1", $text);

Removing duplicate periods

The same, only for repeated periods.

$text = preg_replace("/\.+/i", ".", $text);

XML / HTML tag matching

This simple function takes two arguments: the tag you want to match and the XML or HTML code.

function get_tag($tag, $xml) {
    $tag = preg_quote($tag);
    preg_match_all('{<' . $tag . '[^>]*>(.*?)</' . $tag . '>}', $xml, $matches, PREG_PATTERN_ORDER);
    return $matches[1];
}

Search for XHTML / XML tags with specific attribute values

This example is similar to the previous function, only it lets you narrow the search considerably, for example to find <div id="header">.

function get_tag($attr, $value, $xml, $tag = null) {
    if (is_null($tag))
        $tag = "\\w+";
    else
        $tag = preg_quote($tag);
    $attr = preg_quote($attr);
    $value = preg_quote($value);
    $tag_regex = "/<(" . $tag . ")[^>]*$attr\\s*=\\s*" . "(['\"])$value\\2[^>]*>(.*?)<\\/\\1>/";
    preg_match_all($tag_regex, $xml, $matches, PREG_PATTERN_ORDER);
    return $matches[3];
}

Finding hexadecimal color values

A great example of a regular expression that matches hexadecimal color values in given strings. What is this for? Maybe you want to write a CSS compression service or something similar.

$string = "#555555";
if (preg_match('/^#(?:(?:[a-f\d]{3}){1,2})$/i', $string)) {
    echo "example 6 successful.";
}

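Example of finding the title of a given page

This interesting PHP example uses a regexp to find and return the text between the <title> and </title> tags.

// $url holds the address of the page to read
$fp = fopen($url, "r");
$page = "";
while (!feof($fp)) {
    $page .= fgets($fp, 4096);
}
// preg_match is used here because eregi is no longer available in PHP 7
preg_match("#<title>(.*)</title>#i", $page, $regs);
echo $regs[1];
fclose($fp);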

Parsing the Apache log

Many sites run on the well-known Apache web server. If your site also runs on it, you can parse the server log using a PHP regexp.

// Logs: Apache web server
// Successful hits to HTML files only. Useful for counting the number of page views.
'^((?#client IP or domain name)\S+)\s+((?#basic authentication)\S+\s+\S+)\s+\[((?#date and time)[^]]+)\]\s+"(?:GET|POST|HEAD) ((?#file)\/[^ ?"]+?\.html?)\??((?#parameters)[^ ?"]+)? HTTP\/[0-9.]+"\s+(?#status code)200\s+((?#bytes transferred)[-0-9]+)\s+"((?#referrer)[^"]*)"\s+"((?#user agent)[^"]*)"$'

// Logs: Apache web server
// 404 errors only
'^((?#client IP or domain name)\S+)\s+((?#basic authentication)\S+\s+\S+)\s+\[((?#date and time)[^]]+)\]\s+"(?:GET|POST|HEAD) ((?#file)[^ ?"]+)\??((?#parameters)[^ ?"]+)? HTTP\/[0-9.]+"\s+(?#status code)404\s+((?#bytes transferred)[-0-9]+)\s+"((?#referrer)[^"]*)"\s+"((?#user agent)[^"]*)"$'

Example for checking password complexity

A great example of a regular expression that checks the strength of a password. The password must be at least 6 characters long and contain at least one uppercase letter, one lowercase letter and one digit.

'\A(?=[-_a-zA-Z0-9]*?[A-Z])(?=[-_a-zA-Z0-9]*?[a-z])(?=[-_a-zA-Z0-9]*?[0-9])[-_a-zA-Z0-9]{6,}\z'

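A minimal sketch of how such a password check might be wrapped in a function. The name is_strong_password is hypothetical, and the three lookaheads are assumed to require an uppercase letter, a lowercase letter and a digit, as the description says.

```php
<?php
// Hypothetical wrapper around the password pattern.
function is_strong_password(string $password): bool
{
    // Each (?=...) lookahead checks one requirement without consuming characters;
    // the final class with {6,} enforces the allowed characters and minimum length.
    $pattern = '/\A(?=[-_a-zA-Z0-9]*?[A-Z])(?=[-_a-zA-Z0-9]*?[a-z])'
             . '(?=[-_a-zA-Z0-9]*?[0-9])[-_a-zA-Z0-9]{6,}\z/';
    return preg_match($pattern, $password) === 1;
}

var_dump(is_strong_password('Passw0rd'));  // bool(true)
var_dump(is_strong_password('password'));  // bool(false): no uppercase letter, no digit
var_dump(is_strong_password('Pw0'));       // bool(false): shorter than 6 characters
```
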
Replacing text emoticons with graphic emoticons

This example replaces a text smiley with a graphic one. A simple and useful PHP snippet.

$texte = "A text with a smiley :-)";
echo str_replace(":-)", " ", $texte);
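To handle several smileys at once, one possible approach is a lookup table. In the sketch below, replace_smileys and the image paths are hypothetical placeholders; strtr() is used instead of str_replace() so that text that has already been replaced is not scanned again.

```php
<?php
// Hypothetical map from text smileys to image tags; the file paths are placeholders.
$smileys = [
    ':-)' => '<img src="smileys/smile.png" alt=":-)">',
    ':-(' => '<img src="smileys/sad.png" alt=":-(">',
];

function replace_smileys(string $text, array $smileys): string
{
    // strtr() with an array replaces each key once and never re-scans its own output,
    // so the smiley inside the alt attribute is left alone.
    return strtr($text, $smileys);
}

echo replace_smileys('A text with a smiley :-)', $smileys);
```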

An example of a regular expression to get images from HTML code

Note that this PHP code is used in WordPress to find and process images.

<?php if (have_posts()) : while (have_posts()) : the_post();
    $szPostContent = $post->post_content;
    $szSearchPattern = '~<img [^>]* />~';
    // Run preg_match_all to grab all the images and save the results in $aPics
    preg_match_all($szSearchPattern, $szPostContent, $aPics);
    // Check to see if we have at least 1 image ($aPics[0] holds the full matches)
    $iNumberOfPics = count($aPics[0]);
    if ($iNumberOfPics > 0) {
        // Here you can process your images
        // In this example, they will just be displayed on the monitor
        for ($i = 0; $i < $iNumberOfPics; $i++) {
            echo $aPics[0][$i];
        }
    }
endwhile; endif; ?>

I hope you found this collection of PHP regexp examples helpful. If you have interesting additions or other PHP regular expression examples, share them in the comments.

Regular expressions are special patterns for finding a substring in text. With their help, you can solve the following tasks in one line: “check if the string contains numbers”, “find all email addresses in the text”, “replace several consecutive question marks with one”.

Let's start with a popular piece of programming wisdom:

Some people, when faced with a problem, think: "Yeah, I'm smart, I'll solve it with regular expressions." Now they have two problems.

Sample patterns

Let's start with a couple of simple examples. The first expression, which can be written as /к[а-яё]т/ui, looks for a sequence of 3 letters, where the first letter is "к", the second is any Russian letter, and the third is "т", all matched case-insensitively (for example, "кот" or "КОТ", Russian for "cat", matches this pattern). The second expression, such as /\d\d:\d\d/, looks for a time in the text in the format 12:34.

Any expression begins with a delimiter character. The / symbol is the usual choice, but you can also use other symbols that have no special meaning in regular expressions, for example ~, # or @. Alternative delimiters are handy when / can appear in the expression itself. Next comes the actual pattern of the string we are looking for, followed by a second delimiter, and at the end there may be one or more flag letters. Flags set additional options for the search. Here are some examples of flags:

  • i - says that the search should be case-insensitive (case-sensitive by default)
  • u - says that the expression and the text being searched are using utf-8 encoding, not just Latin letters. Without it, the search for Russian (and any other non-Latin) characters may not work correctly, so you should always use it.
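The effect of the delimiter choice and of the two flags can be checked with a few calls. This is only an illustrative sketch; it assumes the source file itself is saved as UTF-8.

```php
<?php
// The same search written with two different delimiters.
var_dump(preg_match('/http:\/\//', 'http://example.com'));  // int(1): "/" must be escaped
var_dump(preg_match('~http://~', 'http://example.com'));    // int(1): no escaping needed

// The i flag makes the search case-insensitive.
var_dump(preg_match('/cat/', 'CAT'));    // int(0)
var_dump(preg_match('/cat/i', 'CAT'));   // int(1)

// The u flag makes the engine treat pattern and text as UTF-8.
var_dump(preg_match('/^.$/', 'ё'));      // int(0): without u, "ё" is seen as two bytes
var_dump(preg_match('/^.$/u', 'ё'));     // int(1): with u, "ё" is one character
```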

The pattern itself consists of ordinary characters and special constructs. For example, the letter "k" in a regular expression stands for itself, while the characters [0-5] mean "any digit from 0 to 5 can appear here". Here is the complete list of special characters (the PHP manual calls them metacharacters); all other characters in a regex are ordinary: \ ^ $ . [ ] | ( ) ? * + { }

Below we will go through the meaning of each of these characters (and also explain why the letter "ё" is listed separately in the first expression), but for now let's apply our regular expressions to some text and see what happens. PHP has a special function, preg_match($regexp, $text, $match), that takes a pattern, a text and an empty array as input. It checks whether the text contains a substring that matches the pattern and returns 0 if not, or 1 if it does. The first match found is placed in the passed array at index 0. Let's write a simple program that applies regular expressions to different strings:
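A minimal sketch of such a program follows. The two patterns are reconstructed from the description of the sample expressions (an assumption, since the original listing is not available), and the test strings are made up.

```php
<?php
// Patterns reconstructed from the description; treat them as assumptions.
$regexps = [
    '/к[а-яё]т/ui',   // "к", any Russian letter, "т", case-insensitive
    '/\d\d:\d\d/',    // a time such as 12:34
];

$texts = ['Мой КОТ спит', 'The train leaves at 12:34', 'nothing here'];

foreach ($regexps as $regexp) {
    foreach ($texts as $text) {
        $match = [];
        if (preg_match($regexp, $text, $match)) {
            // $match[0] holds the first substring that matched the whole pattern
            echo "$regexp found \"{$match[0]}\" in \"$text\"\n";
        } else {
            echo "$regexp found nothing in \"$text\"\n";
        }
    }
}
```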

After getting acquainted with the example, we will study regular expressions in more detail.

Brackets in regular expressions

Let's recap what the different kinds of brackets mean:

  • Curly braces, as in a{1,5}, specify the number of repetitions of the preceding character; in this example, the expression searches for 1 to 5 consecutive letters "a".
  • Square brackets, as in [abcx-z0-5], mean "any one of these characters", in this case the letters a, b, c, x, y, z or a digit from 0 to 5. Inside square brackets, other special characters such as | or * stand for themselves. If ^ comes first inside the square brackets, the meaning is reversed: "any one character except the listed ones". For example, [^a-c] means "any one character except a, b or c".
  • Parentheses group characters and expressions. For example, in the expression abc+ the plus sign refers only to the letter c, so the expression matches words like abc, abcc, abccc. If you add parentheses, a(bc)+, the plus quantifier refers to the sequence bc, and the expression matches abc, abcbc, abcbcbc.

Note: ranges of characters can be specified in square brackets, but remember that the Russian letter ё sits outside the а-я range, so to write "any Russian letter" you must write [а-яё].
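The three kinds of brackets, and the note about "ё", can be tried out directly. The test strings below are made up for illustration, and the file is assumed to be saved as UTF-8.

```php
<?php
// Curly braces: 1 to 5 repetitions of "a".
preg_match('/a{1,5}/', 'baaad', $m);
echo $m[0], "\n";                                 // aaa

// Square brackets: any one of the listed characters.
var_dump(preg_match('/^[abcx-z0-5]$/', 'y'));     // int(1)
var_dump(preg_match('/^[abcx-z0-5]$/', '7'));     // int(0)
// A leading ^ inverts the set.
var_dump(preg_match('/^[^a-c]$/', 'b'));          // int(0)

// Parentheses: the quantifier applies to the whole group.
var_dump(preg_match('/^abc+$/', 'abccc'));        // int(1)
var_dump(preg_match('/^abc+$/', 'abcbc'));        // int(0)
var_dump(preg_match('/^a(bc)+$/', 'abcbc'));      // int(1)

// "ё" lies outside the а-я range, so it has to be listed separately.
var_dump(preg_match('/^[а-я]$/u', 'ё'));          // int(0)
var_dump(preg_match('/^[а-яё]$/u', 'ё'));         // int(1)
```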

Backslashes

If you've looked at other regex tutorials, you've probably noticed that the backslash is written differently everywhere. Some write a single backslash, \d, but in the examples here it is doubled: \\d. Why?

The regular expression language requires you to write the backslash once. However, in single- and double-quoted strings in PHP the backslash also has a special meaning (see the tutorial on strings). For example, if you write $x = "\$"; PHP treats it as a special combination and inserts only the $ character into the string (and the regex engine never sees the backslash before it). To insert the sequence \$ into a string, we must double the backslash and write the code as $x = "\\$";

For this reason, in some cases (where a character sequence has a special meaning in PHP) we are required to double the backslash:

  • To write \$ in the regex, we write "\\$" in the code
  • To write \\ in the regex, we double each backslash and write "\\\\"
  • To write a backslash and a digit (\1) in the regex, double the backslash: "\\1"

In other cases, one or two backslashes give the same result: both "\\d" and "\d" insert the two characters \d into the string. In the first case the 2 backslashes are the sequence for inserting a single backslash; in the second there is no special sequence, so the characters are inserted as is. You can check which characters end up in the string, and therefore what the regex engine sees, with echo: echo "\$"; Yes, it's complicated, but what can you do?
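The rules above are easy to verify by printing what PHP actually stores in each double-quoted string. The price example at the end is a made-up illustration.

```php
<?php
// What PHP actually puts into each double-quoted string.
echo "\$", "\n";      // $    (PHP consumed the backslash)
echo "\\$", "\n";     // \$
echo "\\\\", "\n";    // \\
echo "\\1", "\n";     // \1
echo "\d", "\n";      // \d   (no special meaning in PHP, kept as is)
echo "\\d", "\n";     // \d

var_dump("\d" === "\\d");   // bool(true)
var_dump(strlen("\$"));     // int(1)
var_dump(strlen("\\$"));    // int(2)

// The regex engine receives /\$\d+/ and looks for a dollar sign followed by digits.
preg_match("/\\$\\d+/", 'price: $42', $m);
echo $m[0], "\n";           // $42
```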

Special constructs in regular expressions

  • \d matches any one digit, \D matches any one character except a digit
  • \w matches any single letter (from any alphabet when the u flag is set), digit, or underscore _. \W matches any one character other than a letter, digit or underscore.

There is also a convenient condition for marking a word boundary: \b. This construct means that on one side of it there must be a character that is a letter, digit or underscore (\w), and on the other side there must not be. For example, suppose we want to find the word "cat" in a text. If we write the regex /cat/ui, it will find this sequence of letters anywhere, for example inside the word "cattle". This is clearly not what we wanted. If we add the word boundary condition to the regular expression, /\bcat\b/ui, then only the separate word "cat" will be found.
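A short sketch that exercises these constructs; the input strings are made up for illustration.

```php
<?php
// \d: any one digit.
preg_match_all('/\d/', 'a1b22', $digits);
print_r($digits[0]);                                     // matches: 1, 2, 2

// \w: a letter, digit or underscore; \w+ therefore picks out whole "words".
preg_match_all('/\w+/u', 'foo_bar, baz42!', $words);
print_r($words[0]);                                      // matches: foo_bar, baz42

// \b: word boundary.
var_dump(preg_match('/cat/ui', 'cattle'));               // int(1): found inside a word
var_dump(preg_match('/\bcat\b/ui', 'cattle'));           // int(0)
var_dump(preg_match('/\bcat\b/ui', 'My cat sleeps'));    // int(1)
```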

Manual

  • PHP Regular Expression Syntax, Detailed Description