RegExpressed

Descriptive functions to create regular expressions.

Readability

Regular expressions are notorious for being hard to read. Part of the issue is that many features use single characters (+, *, ?, etc.). It is hard to remember what each of them does, and whenever you need to match one of these special characters literally it has to be escaped.

This library attempts to solve this issue by using variables, functions and tagged template literals to generate a RegEx. This means you can use vanilla JS to write your RegEx without having to learn all these special characters. There is no need to escape special characters, except those that JavaScript itself requires.
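
As a rough illustration of the idea, the sketch below shows how a tagged template can escape literal text while leaving interpolated patterns intact. The names `escapeLiteral`, `pattern` and `oneOrMoreOf` are hypothetical, and this is not necessarily how RegExpressed is implemented; it only relies on standard JavaScript.

```javascript
// Hypothetical helpers for illustration only; not the library's actual code.

// Escape every character that has a special meaning in a regular expression.
function escapeLiteral(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Tagged template: literal chunks are escaped, interpolated RegExp values are
// inserted unchanged, and any other interpolated value is escaped as text.
function pattern(strings, ...values) {
  let source = "";
  strings.forEach((chunk, i) => {
    source += escapeLiteral(chunk);
    if (i < values.length) {
      const value = values[i];
      source += value instanceof RegExp
        ? value.source
        : escapeLiteral(String(value));
    }
  });
  return new RegExp(source);
}

// Quantifier helper: wraps the inner pattern in a non-capturing group.
function oneOrMoreOf(inner) {
  return new RegExp(`(?:${inner.source})+`);
}

const wordChar = /\w/;
const word = oneOrMoreOf(wordChar);

// The dot and the parentheses are written as plain text and escaped for us.
const fileName = pattern`${word}.txt (copy)`;

console.log(fileName.source);                   // (?:\w)+\.txt \(copy\)
console.log(fileName.test("notes.txt (copy)")); // true
console.log(fileName.test("notesXtxt (copy)")); // false
```

The point of the design is that the only escaping left to the author is whatever JavaScript string syntax itself demands.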

Examples

The comment above each example shows the produced regex.

Word

// /(?:\w)+/
oneOrMore`${wordChar}`

Simplified email, reuses the word regex

// /(?:\w)+/
const word = oneOrMore`${wordChar}`

// /(?:\w)+@(?:\w)+(?:(?:\.nl)|(?:\.com))/
regex`${word}@${word}${or(".nl", ".com")}`

Discord invite

const host = or(
  `discord.gg`,
  `discord.media`,
  `discord.com/invite`,
  `discordapp.com/invite`
);

const inviteCode = quantity(0, 12)`${wordChar}`;

// /(?:(?:discord\.gg)|(?:discord\.media)|(?:discord\.com\/invite)|(?:discordapp\.com\/invite))\/(?:\w){0,12}/g
regexFlag({ global: true })`${host}/${inviteCode}`;

URL

const urlChar = charset`-@:%._+~#=${range("a","z")}${range("A","Z")}${range("0","9")}`;
const protocol = regex`http${optional`s`}://`;

const domain = between(2, 256)`${urlChar}`;
const domainExt = between(2, 6)`${charset`${range("a", "z")}`}`;
const host = regex`${optional`www.`}${domain}.${domainExt}`;

const pathChar = or(urlChar, charset`()?&/=`);
const path = zeroOrMore`${pathChar}`;

// (manually simplified)
// /https?:\/\/(?:www\.)?[\-@:%._+~#=a-zA-Z0-9]{2,256}\.[a-z]{2,6}[\-@:%._+~#=a-zA-Z0-9()?&/=]*/
regex`${protocol}${host}${path}`;
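
The hand-simplified regex from the comment can be tried directly in plain JavaScript. This is only a quick check against made-up sample strings, with no library calls involved. Note that the pattern is unanchored, so it finds a URL anywhere in the input.

```javascript
// The manually simplified regex copied from the comment above.
const urlPattern =
  /https?:\/\/(?:www\.)?[\-@:%._+~#=a-zA-Z0-9]{2,256}\.[a-z]{2,6}[\-@:%._+~#=a-zA-Z0-9()?&/=]*/;

// Sample input invented for this check.
const text = "See https://www.example.com/docs?page=2 for details";
const match = text.match(urlPattern);

console.log(match[0]);                             // https://www.example.com/docs?page=2
console.log(urlPattern.test("ftp://example.com")); // false
```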

Tradeoffs

In terms of functionality there is no tradeoff: all regex features have an equivalent in this library. Most functions map directly to the regex you would normally write by hand. The exception is that all quantifiers wrap the expression in a nonCaptureGroup (?:some-pattern), which can generate a longer regex. It would be possible to detect which groups are unnecessary and safely remove them, but this is not available yet.
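
To see why wrapping matters, the following sketch compares a quantifier applied with and without a non-capturing group. It uses only built-in `RegExp`, and the sample patterns are chosen purely for illustration.

```javascript
// Without a group, the quantifier binds only to the last character: a, then b+.
const ungrouped = /^ab+$/;
// With a non-capturing group, the quantifier applies to the whole "ab" pattern.
const grouped = /^(?:ab)+$/;

console.log(ungrouped.test("abab")); // false
console.log(ungrouped.test("abbb")); // true
console.log(grouped.test("abab"));   // true
console.log(grouped.test("abbb"));   // false

// For a single token the group is redundant: both forms accept the same input.
const bare = /^\w+$/;
const wrapped = /^(?:\w)+$/;
console.log(bare.test("hello") === wrapped.test("hello")); // true

// A non-capturing group adds no capture, so the match holds only the full text.
console.log("abab".match(grouped).length); // 1
```

Always wrapping is the safe default; the cost is only the extra `(?:` and `)` characters in cases like `\w` where the group changes nothing.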

Another tradeoff is that it takes more code; however, I do not see this as a total negative. Regular expressions are hard to read and are often left untouched for a long time, so you have probably forgotten how one works by the time you read it again months later.

This is where splitting the pattern into parts and having descriptive functions helps. The URL example can probably be understood just by reading it, whereas I would have ended up copying the raw regex to a site like https://regexr.com/ to try to understand it.

Documentation

Each function has a short comment explaining what it does; most descriptions are taken almost directly from https://regexr.com/. For more information, see the MDN page.

License

RegExpressed is released under the MIT License.