PAQ8PX ported to ARM processors

Hi everyone,

Over the past few days, I worked on porting the PAQ8PX compressor to ARM CPUs. Initially, it would not compile because it relied on SSE instructions that are not available on ARM, so I had to port that code to equivalent functions for this architecture. Once this was done, paq8px happily compiled and ran on the Snapdragon 845 and 865 CPUs in my Samsung Galaxy S9+ and S20 Ultra smartphones:

paq8px_v187
paq8px_v186fix1 running on ARM

I tested it using the Termux app running Ubuntu from the AndroNix project.

The changes needed to make it compile mostly involved adding #elif directives below the existing #if directives that check whether we are compiling for the i386 or x86_64 architectures. Each new #elif checks whether the platform is ARM. This is necessary because the header immintrin.h doesn’t exist on ARM. Instead, we need to include arm_neon.h and convert the SSE/SSE2 instructions to their NEON equivalents.
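To give an idea of the pattern, here is a minimal sketch of this kind of header selection. It is not the actual paq8px code: the predefined macros I test for, and the names SketchVec, SKETCH_TARGET, sketchTarget and sketchLanes16, are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Pick the intrinsics header that exists on the target architecture.
#if defined(__i386__) || defined(__x86_64__) || defined(_M_IX86) || defined(_M_X64)
#include <immintrin.h>        // x86 only: SSE/AVX intrinsics
using SketchVec = __m128i;    // 128-bit integer vector
#define SKETCH_TARGET "x86"
#elif defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>         // ARM only: NEON intrinsics
using SketchVec = int16x8_t;  // 128-bit vector of eight 16-bit lanes
#define SKETCH_TARGET "arm-neon"
#else
struct SketchVec { std::int16_t lane[1]; };  // no SIMD header: scalar stand-in
#define SKETCH_TARGET "generic"
#endif

// Name of the branch that was selected at compile time.
inline const char *sketchTarget() { return SKETCH_TARGET; }

// Number of 16-bit lanes in the selected vector type.
inline std::size_t sketchLanes16() {
  const std::size_t lanes = sizeof(SketchVec) / sizeof(std::int16_t);
  assert(lanes == 8 || lanes == 1);  // 128-bit SIMD vector, or scalar stand-in
  return lanes;
}
```

Both SIMD branches end up with a 128-bit vector type, which is what makes a mostly one-to-one translation from SSE2 to NEON feasible.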

There were two main files with SSE2/SSSE3 instructions that needed to be ported to NEON. Strictly speaking, these ports were not required, as PAQ8PX contains code that doesn’t depend on the instruction set, but porting them lets the software take full advantage of the CPU.

I started by making the changes to the Mixer.hpp file, which had code for SSE2 and AVX2.

After the port, Mixer.hpp has two new functions, along with some SSE2-to-NEON helper functions:
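As an illustration of what an SSE2-to-NEON helper can look like, here is a simplified sketch built around a 16-bit dot product. It is not the code from Mixer.hpp: the names sketchMaddS16, dotScalar and dotSimd are hypothetical, and the real mixer does more work than a plain dot product. The helper emulates the SSE2 intrinsic _mm_madd_epi16 (multiply 16-bit lanes, then add adjacent pairs) using NEON intrinsics.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>
#define SKETCH_HAVE_SSE2 1
#elif defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define SKETCH_HAVE_NEON 1
// NEON stand-in for _mm_madd_epi16: multiply eight 16-bit lanes and add
// each adjacent pair of 32-bit products, giving four 32-bit sums.
static inline int32x4_t sketchMaddS16(int16x8_t a, int16x8_t b) {
  const int32x4_t lo = vmull_s16(vget_low_s16(a), vget_low_s16(b));
  const int32x4_t hi = vmull_s16(vget_high_s16(a), vget_high_s16(b));
  return vcombine_s32(vpadd_s32(vget_low_s32(lo), vget_high_s32(lo)),
                      vpadd_s32(vget_low_s32(hi), vget_high_s32(hi)));
}
#endif

// Portable reference implementation.
inline std::int32_t dotScalar(const std::int16_t *x, const std::int16_t *w, std::size_t n) {
  std::int32_t sum = 0;
  for (std::size_t i = 0; i < n; ++i) sum += static_cast<std::int32_t>(x[i]) * w[i];
  return sum;
}

// SIMD version; n must be a multiple of 8 (one 128-bit vector per step).
inline std::int32_t dotSimd(const std::int16_t *x, const std::int16_t *w, std::size_t n) {
  assert(n % 8 == 0);
#if defined(SKETCH_HAVE_SSE2)
  __m128i acc = _mm_setzero_si128();
  for (std::size_t i = 0; i < n; i += 8) {
    const __m128i xv = _mm_loadu_si128(reinterpret_cast<const __m128i *>(x + i));
    const __m128i wv = _mm_loadu_si128(reinterpret_cast<const __m128i *>(w + i));
    acc = _mm_add_epi32(acc, _mm_madd_epi16(xv, wv));
  }
  std::int32_t lanes[4];
  _mm_storeu_si128(reinterpret_cast<__m128i *>(lanes), acc);
  return lanes[0] + lanes[1] + lanes[2] + lanes[3];
#elif defined(SKETCH_HAVE_NEON)
  int32x4_t acc = vdupq_n_s32(0);
  for (std::size_t i = 0; i < n; i += 8) {
    acc = vaddq_s32(acc, sketchMaddS16(vld1q_s16(x + i), vld1q_s16(w + i)));
  }
  std::int32_t lanes[4];
  vst1q_s32(lanes, acc);
  return lanes[0] + lanes[1] + lanes[2] + lanes[3];
#else
  return dotScalar(x, w, n);  // no SIMD available
#endif
}
```

Writing the helper so that it mirrors the SSE2 intrinsic keeps the NEON loop structurally identical to the SSE2 loop, which makes the two easy to compare against the scalar reference.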

SimdMixer.hpp and MixerFactory.cpp needed a few additional else if branches and conditions in order to support the NEON code:

SimdMixer.hpp

MixerFactory.cpp
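The general shape of such a change can be sketched as a factory with one extra else if branch for NEON. This is only a rough illustration: the types SketchSimd, SketchMixer and SketchSimdMixer and the function sketchCreateMixer are hypothetical and much simpler than the real classes in paq8px.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical SIMD selector, for illustration only.
enum class SketchSimd { None, Sse2, Avx2, Neon };

struct SketchMixer {
  virtual ~SketchMixer() = default;
  virtual std::string name() const = 0;
};

// One mixer type per SIMD flavour, selected by a template parameter.
template <SketchSimd S>
struct SketchSimdMixer : SketchMixer {
  std::string name() const override {
    switch (S) {
      case SketchSimd::Avx2: return "AVX2";
      case SketchSimd::Sse2: return "SSE2";
      case SketchSimd::Neon: return "NEON";
      default: return "NONE";
    }
  }
};

// Factory: picks the mixer that matches the requested SIMD flavour.
inline std::unique_ptr<SketchMixer> sketchCreateMixer(SketchSimd simd) {
  if (simd == SketchSimd::Avx2) {
    return std::unique_ptr<SketchMixer>(new SketchSimdMixer<SketchSimd::Avx2>());
  } else if (simd == SketchSimd::Sse2) {
    return std::unique_ptr<SketchMixer>(new SketchSimdMixer<SketchSimd::Sse2>());
  } else if (simd == SketchSimd::Neon) {  // the kind of branch an ARM port adds
    return std::unique_ptr<SketchMixer>(new SketchSimdMixer<SketchSimd::Neon>());
  }
  return std::unique_ptr<SketchMixer>(new SketchSimdMixer<SketchSimd::None>());
}
```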

The other file that contained SSSE3 instructions and needed to be ported to NEON was Bucket.hpp. I also did a major refactoring so that the code chooses which SIMD implementation to use, based either on the system's capabilities or on the SIMD level specified by the user:

I ported the SSSE3 code to AVX2:

The SSSE3 function is now called findSsse3:

And here we have the NEON code:

If no SIMD is specified or detected, the code will run findNone:

Finally, here’s the find function. It was extended to accept a second argument indicating which SIMD to use:
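To show how this kind of dispatch fits together, here is a small sketch. It assumes, purely for illustration, that the search looks for a 16-bit value among eight slots and returns its index or -1; the real Bucket.hpp functions are more involved. The names SketchSimd, sketchFindNone, sketchFindSse2, sketchFindNeon and sketchFind are hypothetical, the x86 variant uses plain SSE2 rather than SSSE3, and the AVX2 variant is omitted.

```cpp
#include <cassert>
#include <cstdint>

#if defined(__SSE2__)
#include <emmintrin.h>
#elif defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

// Hypothetical SIMD selector passed as the second kind of argument.
enum class SketchSimd { None, Sse2, Neon };

// Portable fallback: linear scan over the eight slots.
inline int sketchFindNone(const std::uint16_t (&slots)[8], std::uint16_t key) {
  for (int i = 0; i < 8; ++i) {
    if (slots[i] == key) return i;
  }
  return -1;
}

#if defined(__SSE2__)
inline int sketchFindSse2(const std::uint16_t (&slots)[8], std::uint16_t key) {
  const __m128i v = _mm_loadu_si128(reinterpret_cast<const __m128i *>(slots));
  const __m128i eq = _mm_cmpeq_epi16(v, _mm_set1_epi16(static_cast<short>(key)));
  const int mask = _mm_movemask_epi8(eq);  // two mask bits per 16-bit lane
  return mask == 0 ? -1 : __builtin_ctz(static_cast<unsigned>(mask)) / 2;
}
#elif defined(__ARM_NEON) || defined(__ARM_NEON__)
inline int sketchFindNeon(const std::uint16_t (&slots)[8], std::uint16_t key) {
  const uint16x8_t eq = vceqq_u16(vld1q_u16(slots), vdupq_n_u16(key));
  // NEON has no movemask: narrow each lane to one byte and read all 8 bytes.
  const std::uint64_t bits = vget_lane_u64(vreinterpret_u64_u8(vmovn_u16(eq)), 0);
  return bits == 0 ? -1 : __builtin_ctzll(bits) / 8;
}
#endif

// Dispatcher: uses the requested variant if it was compiled in,
// otherwise falls back to the portable scan.
inline int sketchFind(const std::uint16_t (&slots)[8], std::uint16_t key, SketchSimd simd) {
  switch (simd) {
#if defined(__SSE2__)
    case SketchSimd::Sse2: return sketchFindSse2(slots, key);
#elif defined(__ARM_NEON) || defined(__ARM_NEON__)
    case SketchSimd::Neon: return sketchFindNeon(slots, key);
#endif
    default: return sketchFindNone(slots, key);
  }
}
```

Every variant returns the same result for the same input, so the caller only has to pass the SIMD selector through and never needs to know which implementation ran.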

This change required adding this second argument to the corresponding lines of ContextMap.cpp and ContextMap2.cpp.

simd.hpp

This file required adding some #ifdef directives:

Then, I return the value 11 if paq8px was compiled for an ARM processor:
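A stripped-down sketch of the idea follows. Only the value 11 comes from the change described here; the function name sketchSimdDetect, the constant names, the ARM macro check and the placeholder 0 for every other platform are my own assumptions, and the real x86 detection is left out entirely.

```cpp
#include <cassert>

// 11 is the value returned on ARM; 0 is a placeholder chosen for this sketch.
constexpr int SKETCH_SIMD_NEON = 11;
constexpr int SKETCH_SIMD_NONE = 0;

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#define SKETCH_ARM_NEON 1
#endif

// Reports which SIMD level the build targets.
inline int sketchSimdDetect() {
#if defined(SKETCH_ARM_NEON)
  return SKETCH_SIMD_NEON;  // compiled for ARM with NEON
#else
  return SKETCH_SIMD_NONE;  // x86 runtime detection omitted in this sketch
#endif
}
```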

Finally, the paq8px.cpp file was updated to accept the NEON argument when using the -simd parameter.
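Handling such an argument could look roughly like the sketch below. The function sketchParseSimd and the enum SketchSimd are hypothetical, and both the exact set of accepted names and the case-insensitive matching are assumptions rather than a description of the real command-line parser.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical SIMD selector, for illustration only.
enum class SketchSimd { None, Sse2, Ssse3, Avx2, Neon };

// Maps the text given after -simd to a selector.
// Returns false (and leaves `out` untouched) for an unknown name.
inline bool sketchParseSimd(std::string name, SketchSimd &out) {
  for (char &c : name) {
    c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
  }
  if (name == "NONE") {
    out = SketchSimd::None;
  } else if (name == "SSE2") {
    out = SketchSimd::Sse2;
  } else if (name == "SSSE3") {
    out = SketchSimd::Ssse3;
  } else if (name == "AVX2") {
    out = SketchSimd::Avx2;
  } else if (name == "NEON") {  // the newly accepted value
    out = SketchSimd::Neon;
  } else {
    return false;
  }
  return true;
}
```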

And those were the main changes done to paq8px.

The latest version is currently v187, which completes the port of the SSE2/SSSE3 code to NEON and fixes some bugs from previous versions.

You can follow paq8px development on the encode.su forum.