Using Regular Expressions in Erlang

On August 26, I will share my experience of using regular expressions in Erlang. Everyone interested in the topic is welcome to attend.


The conference was rich and informative. I hope that everyone who wanted to attend was able to get access.

I am continuing my research on regular expressions and have started adapting the code examples from the book “Regular Expressions Cookbook, 2nd Edition”. The authors are real gurus of regular expressions and share their deep experience. It is very interesting to apply the experience, ideas and achievements of these experts to the capabilities of the re module.


I am tuning the Erlang documentation, adding information on the union, intersection and subtraction of character classes.

I hope this material will be available in the next OTP version.
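
As a rough illustration of the three operations, here is a minimal sketch that emulates them with plain character classes and lookaheads. This is an assumption about one common technique, not necessarily what the documentation update describes; the module name class_ops_demo and its function names are hypothetical.

```erlang
-module(class_ops_demo).
-export([consonants/1, in_both/1, in_either/1]).

%% Collect every match of Pattern in Subject as a list of strings.
all_matches(Subject, Pattern) ->
    case re:run(Subject, Pattern, [global, {capture, all, list}]) of
        {match, Matches} -> lists:append(Matches);
        nomatch -> []
    end.

%% Subtraction: [a-z] minus [aeiou], via a negative lookahead.
consonants(Subject) ->
    all_matches(Subject, "(?![aeiou])[a-z]").

%% Intersection: [a-m] and [h-z], via a positive lookahead.
in_both(Subject) ->
    all_matches(Subject, "(?=[a-m])[h-z]").

%% Union: simply list both ranges inside one class.
in_either(Subject) ->
    all_matches(Subject, "[a-cx-z]").
```

For example, class_ops_demo:consonants("erlang") returns ["r","l","n","g"], while class_ops_demo:in_both("erlang") returns ["l"], because only l falls inside both ranges.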


I am impressed with the broad approach of the authors of the book “Regular Expressions Cookbook, 2nd Edition”. They analyzed the capabilities of the standard libraries that ship with several programming languages (C#, Java, JavaScript, Perl, PHP, Python, Ruby and VB.NET) and offered readers solutions to typical problems that a programmer encounters when working with regular expressions.
Unfortunately, the authors did not include Erlang, let alone Elixir. I hope that if a next edition of the book comes out, this shortcoming will be addressed (at the very least, I will suggest it to the authors).

I have already adapted the second chapter and most of the third to Erlang’s capabilities. I want regular expressions to help solve complex problems and simplify solutions that are sometimes non-trivial. I have already implemented functions that I could not find in the Erlang standard library (match_chain/2, match_evaluator/3); I implement them myself and share the results of my research in the re_tuner library.


When working with regular expressions, you need to be aware that line endings are a serious source of errors. If the newline convention assumed by the regular expression engine (set through its run options) does not agree with the line endings actually present in the text being examined, the results will be incorrect.

Linux \n: [screenshot not preserved]

Windows \r\n: [screenshot not preserved]
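
To make the danger concrete, here is a small sketch of how the same pattern can behave differently on the two kinds of text. It assumes the default newline convention of the re module (LF); the module name newline_demo and the function last_words/1 are made up for illustration.

```erlang
-module(newline_demo).
-export([last_words/1]).

%% Return the last word of every line. No newline option is passed,
%% so the re module's default line ending (LF) is assumed.
last_words(Text) ->
    Options = [global, multiline, {capture, all, list}],
    case re:run(Text, "\\w+$", Options) of
        {match, Matches} -> lists:append(Matches);
        nomatch -> []
    end.
```

With Linux line endings, newline_demo:last_words("one\ntwo") returns ["one","two"]. With Windows line endings, newline_demo:last_words("one\r\ntwo") returns only ["two"]: the \r sits between "one" and the position where $ can match, so the first line is silently skipped.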

To prevent possible errors, you can use two methods:

  1. normalizing the line endings of the input text,
  2. setting parameters for regular expression execution.

In my opinion, it is more convenient to use the first method.
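
For comparison, the second method could look roughly like the sketch below, which passes the {newline, anycrlf} option of the re module so that \n, \r and \r\n are all treated as line endings. The module name newline_option_demo and the function last_words/1 are hypothetical names chosen for this example.

```erlang
-module(newline_option_demo).
-export([last_words/1]).

%% Return the last word of every line, telling the regex engine to
%% accept LF, CR and CRLF as line endings instead of changing the text.
last_words(Text) ->
    Options = [global, multiline, {newline, anycrlf},
               {capture, all, list}],
    case re:run(Text, "\\w+$", Options) of
        {match, Matches} -> lists:append(Matches);
        nomatch -> []
    end.
```

Here both newline_option_demo:last_words("one\ntwo") and newline_option_demo:last_words("one\r\ntwo") return ["one","two"]. The trade-off is that the option has to be repeated in every call that depends on line boundaries, which is one reason normalizing the text once can be more convenient.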


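-spec sanitize_text(Text) -> Result
    when
        Text :: string(),
        Result :: string().

%% Replace every Windows line ending (\r\n) with a Unix one (\n).
sanitize_text(Text) when is_list(Text) ->
    %% string:replace/4 returns a list of chunks rather than a flat
    %% string, so flatten it to match the string() result in the spec.
    lists:flatten(string:replace(Text, "\r\n", "\n", all)).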

Source code.
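
As a possible variation on the same idea, the normalization can also be done with the re module itself. The sketch below is not taken from re_tuner; the module name line_endings_re and the function normalize/1 are hypothetical, and unlike the function above it also converts a lone \r, which is an added assumption about the input.

```erlang
-module(line_endings_re).
-export([normalize/1]).

%% Convert \r\n and lone \r line endings to \n in a single pass.
%% The pattern \r\n? matches a carriage return with an optional
%% line feed after it; each match is replaced by one line feed.
-spec normalize(string()) -> string().
normalize(Text) when is_list(Text) ->
    re:replace(Text, "\\r\\n?", "\n", [global, {return, list}]).
```

For example, line_endings_re:normalize("a\r\nb\rc\nd") returns "a\nb\nc\nd".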
