Finding Multiples of Four Using Regular Expressions
Finding multiples of four using regular expressions came up recently in a discussion on stack overflow, but none of the replies answered the question except to say that the original poster needed to use a different approach to solve the problem. The poster simply wanted to know how to determine multiples of four using a regex. Being the type of person that can’t let a mystery go unsolved, I became determined to solve this dilemma using a regular expression. Below is what came from my efforts. Obviously this is not the most practical way to determine multiples of four, but I think this is an interesting question if for no other reason than that it presents a cool challenge.
Ultimately, regex is just a pattern matcher, right? So what types of patterns might be created by numbers in multiples of four? To answer this question, I wrote a quick Python script to print out all multiples of four from 1 to 500 ( try it ;).
for i in range(0,500, 4): if i%100 == 0: print print ("%03d |" % (i)),
000 | 004 | 008 | 012 | 016 | 020 | 024 | 028 | 032 | 036 | 040 | 044 | 048 | 052 | 056 | 060 | 064 | 068 | 072 | 076 | 080 | 084 | 088 | 092 | 096 | 100 | 104 | 108 | 112 | 116 | 120 | 124 | 128 | 132 | 136 | 140 | 144 | 148 | 152 | 156 | 160 | 164 | 168 | 172 | 176 | 180 | 184 | 188 | 192 | 196 | 200 | 204 | 208 | 212 | 216 | 220 | 224 | 228 | 232 | 236 | 240 | 244 | 248 | 252 | 256 | 260 | 264 | 268 | 272 | 276 | 280 | 284 | 288 | 292 | 296 | 300 | 304 | 308 | 312 | 316 | 320 | 324 | 328 | 332 | 336 | 340 | 344 | 348 | 352 | 356 | 360 | 364 | 368 | 372 | 376 | 380 | 384 | 388 | 392 | 396 | 400 | 404 | 408 | 412 | 416 | 420 | 424 | 428 | 432 | 436 | 440 | 444 | 448 | 452 | 456 | 460 | 464 | 468 | 472 | 476 | 480 | 484 | 488 | 492 | 496 |
At first, I noticed that the last digit of each number was alternating between 0 4 8 2 6. Now, you might be tempted to use this and just check all strings of digits to see if they end in one of these numbers. However, that won’t work because other integers that aren’t divisible by four (such as 10, 14, 18, 22, 26, etc.) also end with those digits . . . and so the search continued. Next I looked at the last two digits and noticed a repeating pattern between 0 and 100:
If you prefix the single digits with zeros, you’ll notice that this pattern repeats every increment of 100. So then I felt pretty confident that I was on to something. To test my theory further, I pulled up Google and typed in 2147483648 % 4 (which is the next highest number past the maximum 32-bit signed int value, which is divisible by 4). This was just the first arbitrary value that came to mind and has no other significance that I’m aware of. Though, as it turns out, 2147483648 % 4 = 0, so that made me feel really good. I suppose you could actually write out a mathematical proof and prove that this theory works, but I’m more into application. So I figured at this point, all I had to do was write up this regex and then test it against the output of the program written above. So my next goal was to write the actual regex.
If you notice, I conveniently made the program print out the `OR` operator so I could just cut and paste most of the regex and I’d be halfway home. All I wanted were the last two digits, so the first part of my regex looked something like this:
You’ll see I prefixed the zeros to the single digits and added 00 to the front. Again, this is because I wanted to match the last TWO chars, including the 00 from 100 (this will also return strings of 0 as a valid multiple of four, as it should). So then I had my regex suffix written. According to my theory, any string of digits suffixed by the aforementioned two digits is a multiple of four, so I just needed to write a rule for the prefix (any digit) and I was done. This was very easy and is simply [0-9]*. So now my regex looked like this:
I was almost finished, but what did I forget? Single digits!! 0, 4, and 8 will be rejected by the regex above since they are single digits and the above pattern only matches two digits preceded by 0 or more digits. So I had to tweak the regex a little and I ended up with this:
And that’s pretty much it. Technically, you would also have to add word boundaries since you want to treat the entire string of digits as a word. You would add boundary tags like this:
But whether or not you do that depends on your application. If you were going to use this in a lexer, you might be building with jflex; for instance, you might not want to include those since you could have other rules for similar lexemes.
So all in all, that’s how I would find multiples of four using regular expressions. That’s probably not the shortest, most concise regex and I’m sure there are better ways to do it, but if you’re looking for something quick and dirty, I don’t think it gets any quicker or dirtier.