Solved

Regular Expressions (Regex) in TimeXtender


Hello community,

 

Is it possible to do RegEx with TimeXtender? Or use Python code somewhere in TX that supports the library re? 

Our business case:
- We have a database of text reviews of varying lengths, which sometimes contain personal information. We want to read the database in and anonymise the values that contain personally identifiable information.

For example: "oh no, ZuzaGlog doesn't know how TimeXtender works" and we want to make it "oh no, XXXXXX doesn't know how TimeXtender works".

 

We currently do it in Python with the library re, and a list of possible personally identifiable information. We search through the strings with reviews using the keywords from the list, and replace hits with XXXXXX. 
Now we want to build this solution in TimeXtender. 
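
For readers following along, the Python approach described above can be sketched roughly as follows. This is a minimal illustration, not the poster's actual code: the keyword list, the function names `build_pattern` and `anonymise`, and the choice of whole-word, case-insensitive matching are all assumptions.

```python
import re

# Hypothetical keyword list; in practice it would come from a maintained source.
PII_KEYWORDS = ["ZuzaGlog", "Jane Doe"]


def build_pattern(keywords):
    """Compile one regex that matches any keyword as a whole word."""
    if not keywords:
        # An empty alternation would match at every word boundary.
        raise ValueError("keyword list must not be empty")
    # Escape each keyword and try longer ones first so they win over prefixes.
    escaped = sorted((re.escape(k) for k in keywords), key=len, reverse=True)
    return re.compile(r"\b(?:" + "|".join(escaped) + r")\b", re.IGNORECASE)


def anonymise(text, pattern, mask="XXXXXX"):
    """Replace every keyword hit in text with the mask."""
    return pattern.sub(mask, text)


pattern = build_pattern(PII_KEYWORDS)
print(anonymise("oh no, ZuzaGlog doesn't know how TimeXtender works", pattern))
```

Compiling a single pattern up front means each review is scanned once, rather than once per keyword.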

In what way would this be possible?


2 replies

rory.smith
TimeXtender Xpert
  • 686 replies
  • Answer
  • November 3, 2023

Hi @ZuzaGlog ,

If you are doing this interactively using code in Python, then your Python application's output is basically a data source. In principle, what you are doing is closer to something I would handle in a business process than in a data platform. You might consider using a Master Data Management tool to deal with the fuzzy logic required.

That being said, you do have options: the LIKE operator gives limited RegEx-like pattern matching (wildcards and character ranges, not full regular expressions), see: https://learn.microsoft.com/en-us/sql/t-sql/language-elements/like-transact-sql?view=sql-server-ver16 . If you are not using Azure SQL DB, you can add a CLR library that gives you RegEx capabilities: https://techcommunity.microsoft.com/t5/modernization-best-practices-and/sql-server-regular-expressions-library-sample/ba-p/3101875 . Full regular expression support is on the roadmap for SQL Server, but has not been released yet.

You can also call your Python code from PowerShell and weave that into your execution process. Finally, you could develop your own connector in C# that includes this kind of logic.
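
As a rough sketch of the PowerShell option, the Python logic could be wrapped as a small filter that reads text from one stream and writes the masked text to another, so that an external step can pipe data through it. Everything here is an assumption for illustration: the names `mask_lines` and `main`, the default keyword, and the line-per-review layout are hypothetical and not part of TimeXtender.

```python
import io
import re
import sys


def mask_lines(lines, keywords, mask="XXXXXX"):
    """Yield each line with whole-word keyword matches replaced by mask."""
    if not keywords:
        yield from lines  # nothing to mask, pass the input through unchanged
        return
    alternatives = "|".join(
        re.escape(k) for k in sorted(keywords, key=len, reverse=True)
    )
    pattern = re.compile(rf"\b(?:{alternatives})\b", re.IGNORECASE)
    for line in lines:
        yield pattern.sub(mask, line)


def main(in_stream=sys.stdin, out_stream=sys.stdout, keywords=("ZuzaGlog",)):
    """Copy in_stream to out_stream, masking the (hypothetical) keywords."""
    for line in mask_lines(in_stream, keywords):
        out_stream.write(line)
    return 0


# Demonstration with in-memory streams; a real script would instead end with
# `if __name__ == "__main__": sys.exit(main())`.
demo_out = io.StringIO()
main(io.StringIO("oh no, ZuzaGlog doesn't know\nno names here\n"), demo_out)
print(demo_out.getvalue(), end="")
```

Saved as a script with the usual `__main__` guard, this could be invoked from a PowerShell step by piping the exported reviews into it; how that step is scheduled inside the execution process depends on your setup.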


Thomas Lind
Community Manager
  • 1070 replies
  • November 3, 2023

Hi @ZuzaGlog 

This is a good issue to raise, especially with GDPR.

I see this as more of an idea, so I will convert the topic to one, but I would like some more information about how you want it to work and why you need it.

So how would you like it to behave in TimeXtender? I feel the reason why you want it has been covered, but it could probably be expanded a bit. You mention Python being an option. I would like to hear more about how you imagine it working when you want to mask or change private data in a row or table. It does not need to be complicated to do this.

 

