Makestation

Full Version: Project Euler and Software Optimization (Problem 4 & Palindromes)
I play around with Project Euler from time to time. These are programming problems designed to be difficult to solve, but not unsolvable with modest computer hardware and a good algorithm. The trick is to come up with a good algorithm to do the job, and therein lies the problem. Algorithms are hard to design, especially highly efficient ones. Finna 

I usually do these in Python. I also do the very easy problems, and try to optimize them as much as possible. Python, unfortunately, is an interpreted language; it does not compile to machine code. As a result, average Python code is at least 10x slower than equivalent C code, and often much slower. 

That being said, my challenge to myself was to make this problem run as quickly as possible, and to make it run with relatively large numbers. This is based on Problem 4, but I've expanded it somewhat. Instead of using two three-digit numbers to find a palindrome, it uses two numbers of arbitrary length. I've tested it with as many as 10 digits, and found that it runs within fairly reasonable time constraints. 

My python code: https://www.dropbox.com/s/hoekt73ioovwq0...me.py?dl=0

This is technically an O((10^n)^2) algorithm for n-digit factors, so the runtime is exponential in the number of digits and increases quite severely as digits are added. I have yet to find a better algorithm to do the job, so I tried to optimize the one I had as best I could. I've asked as many math majors as I could. I've looked all over the internet for better ways of approaching it. I have yet to crack the problem. It seems like trial and error is the only way to really attack it, at least so far. 

What this does to run faster: 
  • I multithreaded this. The trick to doing this is to use processes instead of threads. This gets around the GIL (Global Interpreter Lock) that ordinarily prevents true parallelism, and allows Python to properly use all available cores. 
  • The algorithm starts by checking the largest numbers first, as it only wants the largest palindrome possible. Generally, the factors of the largest possible palindrome are very easy to find at the top of the possible number range. By starting at the top, runtime is improved significantly. 
  • As it turns out, every palindrome with an even number of digits is divisible by 11 (its alternating digit sum is 0, which is the divisibility rule for 11). This algorithm therefore only ever needs to check every 11th number for one of the factors, making it run ~11 times faster. 
  • It runs as a loop within a loop, checking pairs of factors. However, it assumes that the first factor will always be the larger of the two, since 997 x 993 == 993 x 997. Thus, once it checks a combination of numbers, it never checks the reverse of that combination, which would give the same result. 
  • If it finds a palindrome at 993 x 990, it cannot necessarily stop evaluating lower numbers. If, for example, 992 x 995 was a palindrome, it would be a higher palindrome than the first one would be. However, it has a heuristic to detect if there is possibly any higher palindrome than the highest one it has found yet, using only numbers it hasn't checked. As soon as it detects that no higher palindrome could be possible, it stops evaluating. This massively improved performance compared to the unoptimized code. 
  • It "guesses" on possible values to find the first palindrome. The highest palindrome is almost always made up of two factors that are very near the top of the number range that is given. As a result, it does not attempt to calculate 999 * (any number from 999 to 1). It stops at 999 * (some number around 900-990, depending on settings), and then moves on to 998 * (similar range as previous iteration). 
  • By "guessing" in the most likely range first, it almost always finds palindromes much more quickly. If "guessing" fails to find palindromes, it falls back to checking the full range. It rarely does this. There's almost always a palindrome near the top of the range. 
  • Once it finds the first palindrome, it can check a smaller range of numbers, since it only needs to check combinations that, when multiplied, are higher than the previous highest palindrome. As a result, the algorithm goes much faster once the first palindrome is found. (This is the reason that "cheating" and guessing at palindromes increases performance so drastically.)
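Putting those bullet points together, here is a condensed sketch (my own illustration in plain Python, not the author's linked code) of the descending search with the factor-of-11 shortcut and the early-stopping check:

```python
def is_palindrome(n: int) -> bool:
    s = str(n)
    return s == s[::-1]

def largest_palindrome_product(digits: int) -> int:
    """Largest palindrome that is a product of two `digits`-digit numbers."""
    lo, hi = 10 ** (digits - 1), 10 ** digits - 1
    best = 0
    for a in range(hi, lo - 1, -1):
        # Early stop: since b <= a, no product for this (or any smaller) a
        # can exceed a * a, so once that falls below best we are done.
        if a * a <= best:
            break
        # Every even-digit palindrome is divisible by 11, so if a isn't a
        # multiple of 11, b must be: start b at the largest multiple of 11
        # not above a and step by 11. (This assumes the answer has an even
        # digit count, which holds for the classic problem sizes.)
        if a % 11 == 0:
            b_start, step = a, 1
        else:
            b_start, step = a - (a % 11), 11
        for b in range(b_start, lo - 1, -step):
            p = a * b
            if p <= best:
                break  # products only shrink as b decreases
            if is_palindrome(p):
                best = p
                break
    return best

print(largest_palindrome_product(3))  # 906609
```

For two three-digit factors this returns the classic Problem 4 answer, 906609 = 913 x 993.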
Someone else did another deep dive into this very same problem, and posted their findings and code. I implemented very similar optimizations in my own code, with a few differences. His runs much faster than mine. He is doing it in a much better programming language for performance, and it shows. 

Despite spending quite some time diving into this one, I have yet to find a faster algorithm. If anyone well-versed in math has a better solution, please give me a shout out. I'd be interested in taking suggestions and learning further ways to optimize it! 

In the meantime, happy programming. If anyone wants to see how to multithread python, take a look at my code. It really is that easy. Finna
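For anyone who doesn't want to dig through the linked file, the process-based pattern is roughly this (a minimal sketch using multiprocessing.Pool, not the author's exact code; the brute-force worker and the chunking scheme here are my own simplifications):

```python
from multiprocessing import Pool

def is_palindrome(n: int) -> bool:
    s = str(n)
    return s == s[::-1]

def best_in_chunk(args) -> int:
    """Brute-force the largest palindrome with the first factor in [a_lo, a_hi]."""
    a_hi, a_lo, lo = args
    best = 0
    for a in range(a_hi, a_lo - 1, -1):
        for b in range(a, lo - 1, -1):
            p = a * b
            if p <= best:
                break  # products only shrink as b decreases
            if is_palindrome(p):
                best = p
    return best

def largest_palindrome_parallel(digits: int, workers: int = 4) -> int:
    lo, hi = 10 ** (digits - 1), 10 ** digits - 1
    # Split the outer-loop range into one chunk per worker. Processes
    # (unlike threads) each get their own interpreter, so the GIL never
    # serializes the CPU-bound work.
    step = (hi - lo) // workers + 1
    chunks = [(min(hi, lo + (i + 1) * step - 1), lo + i * step, lo)
              for i in range(workers)]
    with Pool(workers) as pool:
        return max(pool.map(best_in_chunk, chunks))

if __name__ == "__main__":
    print(largest_palindrome_parallel(2))  # 9009
```

Calling `largest_palindrome_parallel(3)` works the same way; on a multi-core machine the chunks genuinely run in parallel.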
Quote: Someone else did another deep dive into this very same problem, and posted their findings and code.
Based rustposter.
A friend lent me a book on it a month or so ago. It's incredible how fast Rust (compiled through LLVM) can be compared to C; in a few tests I've found it generates even better code than C does.
That being said, if you do want to make it extremely fast, then you also need to wrap everything in unsafe{} which scares the shit out of every rust developer.

Cool that you added concurrency though. I was never able to get a good grasp of that, especially when trying to make everything asynchronous. Spawning threads that do stuff was the easy part, but making sure they didn't interfere with each other was a whole other issue. To be fair I only really ever did try in C++ and Java, both of which seemed a lot more complicated than what Python does here.

Do you reckon that removing all the time benchmarking would improve performance at all? I know most of it is just fetching values and doing arithmetic, but you do it in each call to palindromes() and main(), which I'm assuming would add up to a lot of extra instructions. Perhaps remove them and just time the script with the Linux 'time' command?
Code:
time python script.py
Something like that.

Otherwise, your optimization is probably as good as it's gonna get. Then again, I'm no math major, but it seems there's a lot less of a pattern for finding palindromes than, say, finding prime numbers.
Actually, I wondered about this. It wouldn't make much of a difference. It literally records the current time (to the millisecond level) at the start of the script, and then records it as soon as it finds a palindrome. It only tests for the time maybe six or seven times throughout the whole script execution (which maybe adds up to about 0.1 seconds at most). In other words, it adds noticeable time for very small tests, but once you start checking 7+ digits, it's negligible compared to the rest.
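To put a rough number on that (my own quick measurement sketch, not from the script; exact figures vary by machine), timeit can show what a single clock read costs:

```python
import time
import timeit

# Average the cost of one time.time() call over many repetitions.
calls = 100_000
per_call = timeit.timeit(time.time, number=calls) / calls
print(f"roughly {per_call * 1e9:.0f} ns per time.time() call")
# On typical hardware this is well under a microsecond, so a handful of
# timing calls per run is lost in the noise next to the search itself.
```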

And what you've mentioned about rust has me wanting to try it. Python is absolutely not the language to be using for this sort of thing. Multithreading and PyPy helped, but it's still python.

Incidentally, I tried out Cython recently. It's python, but you simply have to declare your types for functions and variables. And boy, oh boy, is it faster. On the order of about 20 times faster. It might just be my go-to from here on out. Finna
(February 12th, 2020 at 5:37 AM)Darth-Apple Wrote: Actually, I wondered about this. It wouldn't make much of a difference. [...]

Ah, fair enough on the timing thing. I figured you'd be running into way more palindromes than that to make it worthwhile.

And mind you, Rust has one hell of a learning curve. Coming from a predominantly C-oriented background, it really screwed with me: the concept of shadowing variables, and everything being immutable by default unless you mark it mut (and take &mut references to anything you're going to be changing a lot). Not to mention the Result<T> and Option<T> wrappers, which make for safer code and easier handling of errors, but I was used to the classic 'retrieving error messages from memory using pointers to where the function f***ed up', as OpenGL loves to do in C++.

Also yeah, Cython is based, especially just loading C code into Python through its wrapper. And yeah, implicit type coercion can get really f***ed performance-wise; just look at how badly JS performs compared to literally anything else. At least with Python you don't get shit like this:
Code:
console.log('5' + 3); // prints 53
console.log('5' - 3); // prints 2

//Go on, open your browser console, I know you want to try it...

And you'll run into the occasional type error like if you ran:
Code:
num = input()
print(num + 3)
# str + int = not allowed: TypeError: can only concatenate str (not "int") to str
Since at least it does differentiate between types instead of implicitly just using them however it sees fit.

But with less processing put towards interpretation, more processing goes towards your actual logic to make stuff fast. Definitely pick up a native language if you're interested in making fast code. Seems you have a really good handle on programming/logic/Python already so it's the natural next step.
The exact thing you describe with JavaScript drives me up a wall with PHP. It has many of the same issues. It’s hard to predict what something will do if you’re used to how it works in another language. My biggest issue with PHP is how it lets you check a variable that’s undefined in a conditional branch statement. It shouldn’t let you do this. There will almost always be a bug if the variable you have in an IF statement isn’t defined. But PHP just throws a notice at you and continues like nothing happened. If your host isn’t configured to display these, you won’t even see it. And if it is, you’ll see a lot of notices from other people’s poor code.

Every programming language has a trade off I suppose. If you could make python fast out of the box, it'd be a godsend for a lot of people. Right now, Java is what you use if you want portability and speed, but Java is harder to pick up and has a steeper learning curve.

But as you say, native is the way to go for speed. Cython is about to be my go to if I can get away with it. I’m not uncomfortable with C, but I can still write code much faster in Python, even if I have to declare types. Finna