PAQ8PX ported to ARM processors
Hi everyone,
Over the past few days, I worked on porting the PAQ8PX compressor to ARM CPUs. Initially, it would not compile because it relied on some SSE instructions that are not available on ARM, so I had to port that code to equivalent functions for this architecture. Once this was done, paq8px happily compiled and ran on the Snapdragon 845 and 865 CPUs of my Samsung Galaxy S9+ and S20 Ultra smartphones.
I tested it using the Termux app running Ubuntu from the AndroNix project.
The changes needed to make it compile mostly involved adding #elif directives below the initial #if directives that check whether we are compiling for the i386 or x86_64 architectures. Each #elif checks whether the platform is ARM. This is necessary because the header immintrin.h doesn’t exist on ARM. Instead, we need to include arm_neon.h and convert the SSE/SSE2 instructions to NEON.
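As a rough illustration of the pattern described above, here is a minimal sketch of the header selection. The macro names are the usual compiler-predefined ones and are my assumption, not copied from paq8px; SKETCH_SIMD_HEADER and simdHeaderName are hypothetical names used only for this example.

```cpp
// Sketch: choose the SIMD intrinsics header by target architecture.
// Assumption: the compiler predefines the usual GCC/Clang/MSVC macros.
#if defined(__i386__) || defined(__x86_64__) || defined(_M_IX86) || defined(_M_X64)
  #include <immintrin.h>   // SSE2 / SSSE3 / AVX2 intrinsics (x86 only)
  #define SKETCH_SIMD_HEADER "immintrin.h"
#elif defined(__ARM_NEON) || defined(__ARM_NEON__) || defined(__aarch64__)
  #include <arm_neon.h>    // NEON intrinsics (ARM only)
  #define SKETCH_SIMD_HEADER "arm_neon.h"
#else
  #define SKETCH_SIMD_HEADER "none"  // no SIMD header: portable code only
#endif

#include <string>

// Hypothetical helper: reports which branch the preprocessor took.
inline std::string simdHeaderName() { return SKETCH_SIMD_HEADER; }
```

Calling simdHeaderName() on a given machine shows which branch was compiled in, which is a quick way to confirm that the ARM path is actually being taken.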
Two main files contained SSE2/SSSE3 instructions that needed to be ported to NEON. Strictly speaking, these ports were not required, since PAQ8PX includes fallback code that doesn’t depend on any particular instruction set, but porting them lets the software take full advantage of the CPU.
I started by making the changes to the Mixer.hpp file, which had code for SSE2 and AVX2.
After porting the code, two new functions were added, along with some SSE2-to-NEON helper functions:
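To give an idea of what an SSE2-to-NEON helper can look like, here is a minimal sketch of a NEON stand-in for the SSE2 intrinsic _mm_madd_epi16, used in a plain 16-bit dot product. This is not the actual paq8px code: the names maddS16Neon, dotProductSketch and dotProductScalar are hypothetical, the length is assumed to be a multiple of 8, and any scaling the real mixer applies is left out.

```cpp
#include <cstddef>
#include <cstdint>

// Portable reference: sum of x[i] * w[i].
inline int32_t dotProductScalar(const int16_t* x, const int16_t* w, size_t n) {
  int32_t sum = 0;
  for (size_t i = 0; i < n; ++i) sum += int32_t(x[i]) * int32_t(w[i]);
  return sum;
}

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>

// NEON equivalent of _mm_madd_epi16: multiply 8 pairs of int16 values and
// add adjacent products, giving 4 int32 results.
static inline int32x4_t maddS16Neon(int16x8_t a, int16x8_t b) {
  int32x4_t lo = vmull_s16(vget_low_s16(a), vget_low_s16(b));    // products 0..3
  int32x4_t hi = vmull_s16(vget_high_s16(a), vget_high_s16(b));  // products 4..7
  return vcombine_s32(vpadd_s32(vget_low_s32(lo), vget_high_s32(lo)),
                      vpadd_s32(vget_low_s32(hi), vget_high_s32(hi)));
}

// Assumption: n is a multiple of 8.
inline int32_t dotProductSketch(const int16_t* x, const int16_t* w, size_t n) {
  int32x4_t sum = vdupq_n_s32(0);
  for (size_t i = 0; i < n; i += 8)
    sum = vaddq_s32(sum, maddS16Neon(vld1q_s16(x + i), vld1q_s16(w + i)));
  // Horizontal add written lane by lane so it also builds on 32-bit ARM.
  return vgetq_lane_s32(sum, 0) + vgetq_lane_s32(sum, 1) +
         vgetq_lane_s32(sum, 2) + vgetq_lane_s32(sum, 3);
}
#else
// Without NEON, fall back to the portable loop.
inline int32_t dotProductSketch(const int16_t* x, const int16_t* w, size_t n) {
  return dotProductScalar(x, w, n);
}
#endif
```

Comparing dotProductSketch against dotProductScalar on the same inputs is a cheap way to check that a ported helper still computes the same result as the original.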
SimdMixer.hpp and MixerFactory.cpp needed some additional else if branches and conditions in order to support the NEON code:
SimdMixer.hpp
MixerFactory.cpp
The other file that contained SSSE3 instructions and needed to be ported to NEON was Bucket.hpp. I also did a major refactoring so that the code chooses which SIMD implementation to use, based on either the detected system capabilities or the SIMD specified by the user:
I ported the SSSE3 code to AVX2:
The SSSE3 function is now called findSsse3:
And here we have the NEON code:
If no SIMD is specified or detected, the code will run findNone:
Finally, here’s the find function. It was extended to accept a second argument indicating which SIMD to use:
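The dispatch idea can be sketched roughly as follows. This is a heavily simplified stand-in, not the real Bucket.hpp: it assumes a bucket of 8 16-bit checksums and only returns the index of the matching slot. SimdSketch, kSlots, findNoneSketch, findNeonSketch and findSketch are hypothetical names, and the NEON path assumes a little-endian target and a GCC/Clang compiler (for __builtin_ctzll).

```cpp
#include <cstdint>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

enum class SimdSketch { None, Neon };  // hypothetical subset of the SIMD choices

constexpr int kSlots = 8;  // assumption: 8 checksums of 16 bits per bucket

// Portable version: index of the first slot equal to key, or -1.
inline int findNoneSketch(const uint16_t* checksums, uint16_t key) {
  for (int i = 0; i < kSlots; ++i)
    if (checksums[i] == key) return i;
  return -1;
}

// NEON version: compare all 8 slots at once; falls back when NEON is absent.
inline int findNeonSketch(const uint16_t* checksums, uint16_t key) {
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
  uint16x8_t eq = vceqq_u16(vld1q_u16(checksums), vdupq_n_u16(key));
  uint8x8_t narrow = vmovn_u16(eq);  // one byte per slot, 0xFF on a match
  uint64_t mask = vget_lane_u64(vreinterpret_u64_u8(narrow), 0);
  if (mask == 0) return -1;
  return __builtin_ctzll(mask) >> 3;  // lowest set bit / 8 = first matching slot
#else
  return findNoneSketch(checksums, key);
#endif
}

// The second argument selects the implementation, as described in the post.
inline int findSketch(const uint16_t* checksums, uint16_t key, SimdSketch simd) {
  switch (simd) {
    case SimdSketch::Neon: return findNeonSketch(checksums, key);
    case SimdSketch::None:
    default:               return findNoneSketch(checksums, key);
  }
}
```

Passing the SIMD choice as an argument keeps the selection in one place, so callers only need to forward the value they were given.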
This change required adding this second argument to the corresponding lines of ContextMap.cpp and ContextMap2.cpp.
simd.hpp
This file required adding some #ifdef directives:
Then, I return the value 11 if paq8px was compiled for an ARM processor:
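A minimal sketch of that idea is shown below. Only the value 11 for ARM comes from this post. The function name simdDetectSketch, the macro checks, and the placeholder 0 for other platforms are my assumptions for illustration; on x86 the real code would query the CPU instead.

```cpp
// Sketch: report a SIMD level at compile time.
// 11 = ARM/NEON (value taken from the post); 0 is only a placeholder here.
inline int simdDetectSketch() {
#if defined(__arm__) || defined(__aarch64__) || defined(_M_ARM) || defined(_M_ARM64)
  return 11;  // compiled for an ARM processor
#else
  return 0;   // placeholder: x86 builds would run CPUID-based detection instead
#endif
}
```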
Finally, the paq8px.cpp file was updated to accept the NEON argument when using the -simd parameter.
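For illustration, mapping the value given to -simd onto an enum could look roughly like this. The enum SimdChoice, the function parseSimdArg, and the exact set of accepted spellings are assumptions for this sketch, not the actual paq8px command-line code.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

enum class SimdChoice { Unknown, None, Sse2, Ssse3, Avx2, Neon };  // hypothetical

// Map the text following -simd to a choice, ignoring letter case.
inline SimdChoice parseSimdArg(std::string value) {
  std::transform(value.begin(), value.end(), value.begin(),
                 [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
  if (value == "none")  return SimdChoice::None;
  if (value == "sse2")  return SimdChoice::Sse2;
  if (value == "ssse3") return SimdChoice::Ssse3;
  if (value == "avx2")  return SimdChoice::Avx2;
  if (value == "neon")  return SimdChoice::Neon;
  return SimdChoice::Unknown;  // caller decides how to report the error
}
```

Returning Unknown rather than silently falling back makes it possible to warn the user about a mistyped option.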
And those were the main changes done to paq8px.
The latest version is currently v187, which completes the port of the SSE2/SSSE3 code to NEON and fixes some bugs from previous versions.
You can follow paq8px development on the encode.su forum.