# How to use regular expressions (RegEx)

Created: 26.01.2023

Updated: 06.04.2023

Author: Polina A.

{% hint style="success" %}
Regular expressions (also known as "<mark style="color:red;">regexp</mark>" or "<mark style="color:red;">regex</mark>") are a formal language for working with text: searching for substrings within it and performing manipulations on them. [Metacharacters](https://en.wikipedia.org/wiki/Regular_expression) are characters that have a special meaning inside a regular expression instead of standing for themselves.

**When working with the Gravity Field system, regular expressions can be used in some of the targeting conditions in campaigns and in audiences.** A search is performed using a pattern string or "mask," consisting of characters and metacharacters, which define the search rule.

Before using a regular expression in a targeting condition, you can test it against sample strings on the website [**https://regex101.com/**](https://regex101.com/).
{% endhint %}
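
To illustrate how a pattern combines ordinary characters with metacharacters, here is a minimal Python sketch. The pattern and the page URLs are hypothetical and chosen only for illustration. The regular expression flavor used by Gravity Field may differ from Python's `re` module, so treat this as a way to reason about a pattern, not as a description of the platform's behavior.

```python
import re

# Hypothetical pattern: the literal text "/catalog/" followed by one or more digits.
# "/catalog/" consists of ordinary characters; "\d" and "+" are metacharacters.
pattern = re.compile(r"/catalog/\d+")

# Hypothetical page URLs, used only for this illustration.
urls = [
    "https://site.com/catalog/125",
    "https://site.com/catalog/shoes",
    "https://site.com/blog/125",
]

# re.search looks for the pattern anywhere inside the string.
matched = [url for url in urls if pattern.search(url)]
print(matched)
```

Only the first URL is kept: the second has no digits after `/catalog/`, and the third does not contain the literal text `/catalog/`.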

### **Example of a Regular Expression**

There are various link formats:

1. <http://site.com>
2. <https://site.com/>
3. <http://site.com/page/>
4. site.com/page.html

The regular expression <mark style="color:red;background-color:yellow;">**`(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w\.-]*)*\/?`**</mark> matches **all four** link variants listed above and can also find links of these forms within an arbitrary string. It is a practical approximation and not a complete URL validator: for example, it does not cover uppercase letters in the domain name or query strings.
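
The claim can be checked locally as well as on regex101. The sketch below assumes Python's `re` module, whose flavor may differ in details from the one used by Gravity Field. The pattern is written with balanced parentheses, and the sample sentence in `text` is a made-up example.

```python
import re

# The URL pattern from the text, written as a Python raw string.
URL_PATTERN = re.compile(
    r"(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w\.-]*)*\/?"
)

links = [
    "http://site.com",
    "https://site.com/",
    "http://site.com/page/",
    "site.com/page.html",
]

# fullmatch succeeds only if the entire string fits the pattern.
results = {link: URL_PATTERN.fullmatch(link) is not None for link in links}
for link, ok in results.items():
    print(f"{link}: {'match' if ok else 'no match'}")

# search finds the first link inside an arbitrary string (made-up sample text).
text = "Our new page is at https://site.com/page/ now"
found = URL_PATTERN.search(text)
print(found.group(0) if found else None)
```

All four variants match in full, and `search` extracts `https://site.com/page/` from the sample sentence.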

### **How to read a regular expression?**

1. <mark style="color:red;">**`(https?:\/\/)`**</mark> - everything within parentheses forms a capturing group. The example contains 4 such groups;
2. <mark style="color:red;">**`\`**</mark> - the escape character: it makes the character that follows it literal. The following characters have a special meaning in the regular expression language and must be escaped to be matched literally: <mark style="color:red;">**`. ^ $ * + ? { } [ ] \ | ( )`**</mark>. The forward slash <mark style="color:red;">**`/`**</mark> is escaped in the example as well, because tools that use it as the pattern delimiter (regex101 does so by default) require it;
3. <mark style="color:red;">**`?`**</mark> - the optional quantifier: the preceding element may occur zero times or once. In the first usage it applies to the letter <mark style="color:red;">**`s`**</mark>, so both <mark style="color:red;">**`http`**</mark> and <mark style="color:red;">**`https`**</mark> match. In the second usage it applies to the whole group <mark style="color:red;">**`(https?:\/\/)`**</mark>, so the protocol may be omitted entirely. This is not the "lazy" quantifier: laziness is requested by placing <mark style="color:red;">**`?`**</mark> directly after another quantifier (as in <mark style="color:red;">**`*?`**</mark> or <mark style="color:red;">**`+?`**</mark>), which switches the default "greedy" matching (match as much as possible) to matching as little as possible;
4. <mark style="color:red;">**`([\da-z\.-]+)`**</mark> - the second matching group. Square brackets define a character class: one position in the text matches any single character listed inside the brackets;
5. <mark style="color:red;">**`\d`**</mark> - denotes any digit (0-9);
6. <mark style="color:red;">**`a-z`**</mark> - a range of lowercase Latin letters. The <mark style="color:red;">`-`</mark> at the end of the class is not part of a range; it is a literal hyphen that can also be matched;
7. <mark style="color:red;">**`+`**</mark> - a quantifier requiring the preceding element to occur **one or more times**, with no upper limit;
8. <mark style="color:red;">**`([a-z\.]{2,6})`**</mark> - the third matching group. <mark style="color:red;">**`{2,6}`**</mark> sets the allowed number of repetitions, here from 2 to 6. This group checks the top-level domain, which the pattern assumes to be 2 to 6 characters long, so an unlimited number of repetitions is not necessary. Longer top-level domains, such as `.photography`, exist and would need a larger upper bound;
9. <mark style="color:red;">**`([\/\w\.-]*)*\/?`**</mark> - the fourth matching group, followed by an optional trailing slash. <mark style="color:red;">**`\w`**</mark> matches any letter from a to z in lower or upper case, any digit, and the underscore `_`;
10. <mark style="color:red;">`*`</mark> - a quantifier requiring the preceding element to occur **zero or more times**, with no upper limit.
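
The individual building blocks from the list above can be tried in isolation. The following is a minimal sketch assuming Python's `re` module; the sample strings are made up for illustration, and other regular expression flavors may differ in details (for example, in what `\w` covers).

```python
import re

# "?" makes the preceding element optional: the "s" may be present or absent.
optional_ok = all(re.fullmatch(r"https?", s) for s in ["http", "https"])

# A lazy quantifier is a different construct: "?" placed after another quantifier.
greedy = re.search(r"<.+>", "<a><b>").group(0)   # takes as much as possible
lazy = re.search(r"<.+?>", "<a><b>").group(0)    # takes as little as possible

# Character class: every position may be a digit, a lowercase letter, "." or "-".
in_class = re.fullmatch(r"[\da-z\.-]+", "my-site2.shop") is not None
not_in_class = re.fullmatch(r"[\da-z\.-]+", "my_site") is not None  # "_" is not listed

# {2,6}: between 2 and 6 repetitions of the class.
lengths = {
    s: re.fullmatch(r"[a-z\.]{2,6}", s) is not None
    for s in ["c", "com", "co.uk", "website"]
}

# \w covers letters, digits and the underscore, but not the hyphen.
word_chars = re.findall(r"\w", "aZ9_-")

# "+" needs at least one occurrence, "*" also accepts none.
plus_on_empty = re.fullmatch(r"\d+", "") is not None
star_on_empty = re.fullmatch(r"\d*", "") is not None

print(optional_ok, greedy, lazy)
print(in_class, not_in_class, lengths)
print(word_chars, plus_on_empty, star_on_empty)
```

Running the sketch shows, among other things, that `<.+>` captures `<a><b>` while `<.+?>` stops at `<a>`, and that an empty string satisfies `\d*` but not `\d+`.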
