Blog Post

Azure SQL Blog
5 MIN READ

Unlocking the Power of Regex in SQL Server

abhimantiwari's avatar
abhimantiwari
Icon for Microsoft rankMicrosoft
May 19, 2025

Regular expressions, or regex, are powerful tools for text manipulation and pattern matching. They are incredibly useful in various scenarios, such as data validation, extraction, and transformation. SQL Server 2025 introduces native support for regex functions, making it easier to perform complex text operations directly within SQL queries. This Regex support is available in SQL Server 2025 public preview, Azure SQL DatabaseSQL DB in Fabric, and Azure SQL Managed Instance. The table-valued functions (TVFs) are currently available on SQL Server 2025 public preview however they will follow on other offerings as well in the coming weeks, once deployment is complete.

Advantages of using Regex in SQL queries

Regular expressions offer several advantages when used in SQL queries:

  • Flexibility in Pattern Matching: Regular expressions allow for complex and flexible pattern matching, making it easier to search for specific text patterns within large datasets.
  • Efficiency in Data Manipulation: Regex functions can efficiently manipulate and transform text data, reducing the need for multiple string functions and simplifying queries.
  • Improved Data Validation: Regular expressions can be used to validate data formats, ensuring that data conforms to specific patterns, such as email addresses, phone numbers, and zip codes.
  • Enhanced Data Extraction: Regex functions can extract specific parts of text data, making it easier to retrieve valuable information from unstructured data.
  • Standardization and Cleaning: Regex can help standardize and clean data by removing unwanted characters, whitespace, and duplicates.

Regex functions –

REGEXP_LIKE

The REGEXP_LIKE function checks if a string matches a pattern. It’s useful for data validation and filtering rows based on specific patterns. This function can be used in the WHERE/ HAVING clause, CHECK constraints, CASE expressions or conditional statements etc.

Example:

Scenario #1: This query finds all employees with valid email addresses.

SELECT [Name], Email
FROM Employees
WHERE REGEXP_LIKE(Email, '[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$');

Scenario #2: Create Employees table with CHECK constraints on ‘Email’ and ‘Phone_Number’ columns, to ensure data meets the specified criteria.

DROP TABLE IF EXISTS Employees
CREATE TABLE Employees (
    ID INT IDENTITY(101,1),
    [Name] VARCHAR(150),
    Email VARCHAR(320)
	CHECK (REGEXP_LIKE(Email, '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$')),
	Phone_Number NVARCHAR(20)
	CHECK (REGEXP_LIKE (Phone_Number, '^(\d{3})-(\d{3})-(\d{4})$'))
);

REGEXP_COUNT

The REGEXP_COUNT function counts the number of times a pattern appears in a string. It’s useful for analyzing text data and identifying patterns.

Example:

This query counts the number valid email addresses for each domain.

SELECT REGEXP_SUBSTR(Email, '@(.+)$', 1, 1,'c',1) AS Domain, COUNT(*) AS Valid_email_count
FROM Employees
WHERE REGEXP_COUNT(Email, '^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$') > 0
GROUP BY REGEXP_SUBSTR(Email, '@(.+)$', 1, 1,'c',1);

REGEXP_INSTR

The REGEXP_INSTR function finds the position of a pattern in a string. It’s useful for locating specific substrings within text data.

Example:

This query finds the position of the @ sign in each employee’s email address.

SELECT Name, Email, REGEXP_INSTR(Email, '@') AS Position_of_@ FROM Employees;

REGEXP_REPLACE

The REGEXP_REPLACE function returns a modified string replaced by a ‘replacement string’, where occurrence of the regular expression pattern found. You can use it to modify or transform text data based on a regex pattern.

Example:

This query will redact sensitive data of variable length and format from a table that contains personal information like Social Security Numbers (SSNs) and credit card numbers.

SELECT sensitive_info,
       REGEXP_REPLACE(sensitive_info, '(\d{3}-\d{2}-\d{4}|\d{4}-\d{4}-\d{4}-\d{4})', '***-**-****') AS redacted_info
FROM personal_data;

REGEXP_SUBSTR

The REGEXP_SUBSTR function extracts a part of a string that matches a pattern. It’s useful for extracting specific information from text data.

Example:

This query extracts the domain part of valid email addresses.

SELECT Name, Email, REGEXP_SUBSTR(Email, '@(.+)$', 1, 1, 'i', 1) AS Domain
FROM Employees
WHERE REGEXP_LIKE(Email, '^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$');

REGEXP_MATCHES

The REGEXP_MATCHES function finds all the matches of a pattern in a string and returns them in a tabular format (i.e. as rows and columns). It’s useful for extracting multiple occurrences of a pattern. It is typically used to find all occurrences of a pattern or to retrieve parts of a string using capture groups, especially when you expect multiple matches per input.

Example:

This query finds all key-value pairs including substring-matches from the string and returns them in tabular format.

SELECT * 
FROM REGEXP_MATCHES('Name: John Doe; Email: john.doe@example.com; Phone: 123-456-7890', '(\w+):\s*([^;]+)');

Results:

match_idstart_positionend_positionmatch_valuesubstring_matches
1114Name: John Doe[{“value”:”Name”,”start”:1,”length”:4},{“value”:”John Doe”,”start”:7,”length”:8}]
21743Email: john.doe@example.com[{“value”:”Email”,”start”:17,”length”:5},{“value”:”john.doe@example.com”,”start”:24,”length”:20}]
34664Phone: 123-456-7890[{“value”:”Phone”,”start”:46,”length”:5},{“value”:”123-456-7890″,”start”:53,”length”:12}]

REGEXP_SPLIT_TO_TABLE

The REGEXP_SPLIT_TO_TABLE function splits a string based on a pattern as the delimiter and returns the fragments as rows in a table. It’s useful for splitting strings with a list of items or breaking down text into words or sentences.

Example:

This query splits the string into parts based on semicolon and space characters and returns them in a tabular format.

SELECT * 
FROM REGEXP_SPLIT_TO_TABLE('Name: John Doe; Email: john.doe@example.com; Phone: 123-456-7890', '; ');

Results:

valueordinal
Name: John Doe1
Email: john.doe@example.com2
Phone: 123-456-78903

 

Note

REGEXP_LIKEREGEXP_MATCHES and REGEXP_SPLIT_TO_TABLE functions are available only under compatibility level 170 and above. If your database compatibility level is lower than 170, SQL Server can’t find and run these functions. Other regular expression functions are available at all compatibility levels. You can check compatibility level in the sys.databases view or in database properties. You can change the compatibility level of a database with the following command:

ALTER DATABASE [DatabaseName] SET COMPATIBILITY_LEVEL = 170;

 

If your current database compatibility level is below 170 and you have a use-case where you want to leverage the REGEXP_LIKE function but need additional time to perform the complete analysis and testing before switching the compatibility level, you can use REGEXP_COUNT or REGEXP_INSTR functions as an interim solution.

Example:

SELECT	[Name], Email
FROM	(VALUES('John Doe', 'john@contoso.com'),
	('Alice Smith', 'alice@fabrikam.com'),
	('Bob Johnson', 'bob@fabrikam.net'),
	('Charlie Brown', 'charlie@contoso.co.in')) as e (Name, Email)
WHERE	REGEXP_COUNT(e.email, '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$') > 0
--OR	REGEXP_INSTR(e.email, '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$') > 0;

Availability

All the intrinsic functions and TVFs are available in SQL Server 2025 public preview. The intrinsic functions are also available in Azure SQL Database, SQL DB in Fabric, and Azure SQL Managed Instance. The TVFs will follow in the future once deployment is completed.

To learn more about the feature, please visit – learn page.

 

Hint
For those who find writing complex regex patterns challenging, leveraging Copilot can be a game-changer. Copilot can assist in generating regex patterns based on your requirements, making it easier to implement and test them in your SQL queries.
Try asking Copilot for regex patterns!

 

Conclusion

Regular expressions in SQL Server are powerful tools that can make your data tasks easier and more efficient. By bringing regex support natively, SQL Server lets you push more of these tasks into the database layer, which can simplify application code and enable new kinds of queries. Start simple with these functions, test your patterns, and soon you’ll be slicing and dicing strings in SQL with ease.

We encourage you to explore these functions and provide your valuable feedback. Stay tuned for more updates and enhancements in the future.

Thank you for being a part of our community and helping us make Azure SQL and SQL Server the best platform for your data needs!

Updated May 19, 2025
Version 1.0
No CommentsBe the first to comment