Talospace
The free computing frontier
Latest Posts
Updated Baseline JIT OpenPOWER patches for Firefox 128ESR
Fedora 41
Running Thunderbird with the OpenPower Baseline JIT
I wasn't able to get a full LTO-PGO build for Thunderbird to build properly so far with gcc (workin' on it), but with the JIT patches for ESR128 an LTO optimized build will complete and run, and that's good enough for now. The diff for the .mozconfig is more or less the following:
export CC=/usr/bin/gcc
export CXX=/usr/bin/g++
mk_add_options MOZ_MAKE_FLAGS="-j24"
#ac_add_options --enable-application=browser
#ac_add_options MOZ_PGO=1
#
ac_add_options --enable-project=comm/mail
mk_add_options MOZ_OBJDIR=@TOPSRCDIR@/tbobj
ac_add_options --enable-optimize="-O3 -mcpu=power9 -fpermissive"
ac_add_options --enable-release
ac_add_options --enable-linker=bfd
ac_add_options --enable-lto=full
ac_add_options --without-wasm-sandboxed-libraries
ac_add_options --with-libclang-path=/usr/lib64
export GN=/home/censored/bin/gn # if you haz
export RUSTC_OPT_LEVEL=2
You can use a unified .mozconfig like this to handle both the browser and the E-mail client; if you do, then to build the browser, uncomment the commented-out lines and comment out the two lines that follow them.
You'll need comm-central embedded in your ESR128 tree as per the build instructions, and you may want to create an .hg/hgignore file inside your ESR128 source directory as well to keep changes to the core and Tbird from clashing, something like
^tbobj/
^comm/
which will ignore those directories but isn't a change to .hgignore that you have to manually edit out. Once constructed, your built client will be in tbobj/. If you were using a prebuilt Thunderbird before, you may need to start it with tbobj/dist/bin/thunderbird -p default-release (substitute your profile name if it differs) to make sure you get your old mailbox back, though as always back up your profile first.
Music production on Power: an adventure in porting
[Here's a guest post from taylor.fish on their porting work on music and audio software. I thought it made a good tutorial on porting and also is a great way to show off the diverse things people are doing on OpenPower. Like all guest and first-party posts on Talospace, this article remains the property of the original author and may be distributed under CC-BY-SA 4.0. -- Ed.]
For the past five years, I’ve used a Blackbird as my primary computing device. Prior to that I used x86 systems flashed with Libreboot, but aging hardware running unsupported firmware only gets you so far.
The switch to a truly owner-controlled Power ISA system was a welcome one, but it wasn’t without its growing pains. I could no longer assume any given piece of Linux-compatible software would successfully compile and run on my machine, due to the small but significant portion with architecture-specific issues. That didn’t deter me from continuing to use my Blackbird as my main device, but when I wanted to get back into music production, I knew I would have to confront this problem: my previous endeavors, although sticking entirely to free software, were all on x86.
Of particular importance in digital music production are plugins, which include instruments like synthesizers and samplers, effects like equalizers and reverb, and analysis tools like spectrograms and oscilloscopes. While there are some excellent plugins released as free software, they are vastly outnumbered by proprietary ones, and it takes dedication to commit to producing music only with free software. Not wanting that uphill battle to feel more like a cliff, when I discovered that some of those plugins wouldn’t run on Power, I was determined to change that fact rather than lose what few tools I had.
And so my porting adventure began. Over the past year or so I developed patches for every piece of free/libre audio software that I wanted to use but that didn’t work on Power. Some of those changes have been merged upstream, but for all the others I maintain a GitHub organization called PowerAudio that contains forked versions of each repository with ppc64le patches applied (along with some other improvements for use on GNU/Linux). If you just want working audio plugins on Power, visit that page, which has more information about each piece of ported software. If you want to know more about what the porting process entailed, however, read on…
The porting process
Each piece of software to be ported has its own unique set of problems, but there are some common themes. Here are some of the most frequent issues that prevent audio plugins from working on Power ISA systems:
1. Architecture-specific compiler options
This is the easiest issue to fix. Some projects pass architecture-specific options to the compiler (like -msse on x86) but don’t restrict those options only to the relevant architectures. In that case the fix is simply to perform an architecture check before applying those options, as in this change to tap-lv2:
@@ -10 +10,3 @@
+ifneq ($(findstring $(shell uname -m),x86_64 amd64 i386 i686),)
CFLAGS += -mtune=generic -msse -msse2 -mfpmath=sse
+endif
Some projects handle specific architectures but then assume x86 as a fallback. In that case, it’s easiest to continue the pattern and add ppc as one of the cases, as in this change to Helm:
@@ -25,4 +25,8 @@ ifneq (,$(findstring aarch,$(MACHINE)))
ifneq (,$(findstring arm,$(MACHINE)))
SIMDFLAGS := -march=armv8-a -mtune=cortex-a53 -mfpu=neon-fp-armv8 -mfloat-abi=hard
+else
+ifneq (,$(findstring ppc,$(MACHINE)))
+ SIMDFLAGS :=
else
SIMDFLAGS := -msse2
+endif
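The guard in both diffs boils down to "only emit x86 flags when the machine really is x86". As a rough sketch of that logic outside of make (the helper name `simd_flags` is hypothetical, the flag list is borrowed from the tap-lv2 line above, and exact set membership is used here where the Makefile uses a substring match):

```python
import platform

# Machine names as reported by `uname -m` that should receive SSE flags.
X86_MACHINES = {"x86_64", "amd64", "i386", "i686"}


def simd_flags(machine: str) -> list[str]:
    """Return extra compiler flags for a `uname -m`-style machine string.

    Hypothetical helper: anything that is not explicitly x86 gets no
    architecture-specific flags instead of inheriting the x86 ones.
    """
    if machine in X86_MACHINES:
        return ["-msse", "-msse2", "-mfpmath=sse"]
    return []


if __name__ == "__main__":
    machine = platform.machine()
    print(machine, simd_flags(machine))
```

The point of the design is that the default branch is the empty one, so a new architecture builds without anyone having to add a case for it.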
2. Assumptions broken by Power
Some code appears to be cross-platform but contains assumptions that don’t apply to Power ISA systems. For example, the plugin framework DPF used the first part of gcc -dumpmachine to obtain the correct directory name for VST plugins. On 64-bit little-endian PowerPC, that yields powerpc64le. But the VST 3 specification says the directory name should match uname -m, which is ppc64le. These two identifiers happen to be identical for many architectures, but generally differ on PowerPC and Power ISA systems.
Simply using uname -m here would break cross-compilation, so the fix performs a text substitution to correct the discrepancy on Power:
@@ -691 +691,3 @@ ifeq ($(LINUX),true)
-VST3_BINARY_DIR = Contents/$(TARGET_PROCESSOR)-linux
+# This must match `uname -m`, which differs from `gcc -dumpmachine` on PowerPC.
+VST3_ARCHITECTURE := $(patsubst powerpc%,ppc%,$(TARGET_PROCESSOR))
+VST3_BINARY_DIR = Contents/$(VST3_ARCHITECTURE)-linux
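To make the substitution concrete, here is a minimal Python sketch of what the patsubst line is meant to do for the values discussed above. The function names are made up for illustration and are not part of DPF:

```python
def vst3_architecture(target_processor: str) -> str:
    """Rewrite a leading "powerpc" to "ppc", leaving other names alone.

    Illustrative stand-in for the make substitution in the diff above;
    `target_processor` is the first field of `gcc -dumpmachine`.
    """
    prefix = "powerpc"
    if target_processor.startswith(prefix):
        return "ppc" + target_processor[len(prefix):]
    return target_processor


def vst3_binary_dir(target_processor: str) -> str:
    """Build the bundle subdirectory name used for the plugin binary."""
    return f"Contents/{vst3_architecture(target_processor)}-linux"


if __name__ == "__main__":
    for name in ("powerpc64le", "powerpc64", "x86_64", "aarch64"):
        print(name, "->", vst3_binary_dir(name))
```

Because the input still comes from the compiler's target rather than from the build host, cross-compilation keeps working.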
A similar issue existed in JUCE, a popular framework for audio plugins. JUCE performs architecture detection with a chain of preprocessor conditionals that (somewhat hackily) use #error to emit the correct architecture identifier, which JUCE uses in places that expect the output of uname -m. While those conditionals do attempt to detect PowerPC, they don’t account for the fact that the different endiannesses of 64-bit PowerPC have different identifiers, and incorrectly classify ppc64le as ppc64. This can cause compilation to fail entirely when JUCE uses the incorrect architecture name but then runs a validator that expects the correct one.
A simple endianness check fixes this one:
@@ -64,3 +64,7 @@ #elif defined(__ppc__) || defined(__ppc) || ...
#if defined(__ppc64__) || defined(__powerpc64__) || defined(__64BIT__)
- #error JUCE_ARCH ppc64
+ #ifdef __LITTLE_ENDIAN__
+ #error JUCE_ARCH ppc64le
+ #else
+ #error JUCE_ARCH ppc64
+ #endif
#else
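The decision the patched conditionals make can be written out as a tiny function. This is only a sketch of the naming rule, not JUCE code: `ppc_arch_name` is a hypothetical name, and returning "ppc" for the non-64-bit case is an assumption based on what uname -m reports on 32-bit PowerPC, since that branch is not shown in the diff.

```python
import platform
import sys


def ppc_arch_name(bits: int, byteorder: str) -> str:
    """Map word size and endianness to a `uname -m`-style PowerPC name."""
    if bits != 64:
        return "ppc"  # assumption: branch not shown in the diff above
    return "ppc64le" if byteorder == "little" else "ppc64"


if __name__ == "__main__":
    machine = platform.machine()
    if machine.startswith("ppc"):
        bits = 64 if sys.maxsize > 2**32 else 32
        expected = ppc_arch_name(bits, sys.byteorder)
        print(f"uname -m reports {machine}; derived name is {expected}")
    else:
        print(f"{machine} is not a PowerPC machine; nothing to compare")
```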
3. Lack of inclusion in platform-specific code
Some projects contain truly platform-specific code that needs to be written separately for each architecture, but don’t include PowerPC as one of the handled cases. In these situations, the most straightforward (and sometimes only) fix is to add the necessary Power-specific code. For example, sfizz contained a copy of a low-level dependency that only supported x86 and ARM, but because that dependency had already added support for Power and other architectures upstream, all that was necessary for sfizz was to update the dependency.
Another example comes from DISTRHO Ports, a large collection of plugins ported to GNU/Linux, in which an architecture detection script required adding code to detect the various types of PowerPC:
@@ -42,4 +42,19 @@
elif echo "${fileout}" | grep -q "x86-64"; then
if [ "$(uname -m)" != "x86_64" ]; then
MESON_EXE_WRAPPER="qemu-x86_64-static"
fi
+
+ elif echo "${fileout}" | grep -q "64-bit LSB.*PowerPC"; then
+ if [ "$(uname -m)" != "ppc64le" ]; then
+ MESON_EXE_WRAPPER="qemu-ppc64le-static"
+ fi
+
+ elif echo "${fileout}" | grep -q "64-bit MSB.*PowerPC"; then
+ if [ "$(uname -m)" != "ppc64" ]; then
+ MESON_EXE_WRAPPER="qemu-ppc64-static"
+ fi
...
Although the purpose of this script is to aid cross-compilation, the lack of Power support prevented even native compilation.
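The branches in that script all follow one pattern: match a substring of the `file` output, then pick an emulator only if the host differs from the target. A hedged Python sketch of the same table-driven idea follows; the `exe_wrapper` name and the choice to raise an error for unrecognized output are assumptions for illustration, since the real script's remaining branches are elided above.

```python
import re
from typing import Optional

# (pattern in `file` output, matching `uname -m` value, emulator to use)
_WRAPPERS = [
    (re.compile(r"x86-64"), "x86_64", "qemu-x86_64-static"),
    (re.compile(r"64-bit LSB.*PowerPC"), "ppc64le", "qemu-ppc64le-static"),
    (re.compile(r"64-bit MSB.*PowerPC"), "ppc64", "qemu-ppc64-static"),
]


def exe_wrapper(file_output: str, host_machine: str) -> Optional[str]:
    """Return the emulator needed to run a binary, or None if it is native."""
    for pattern, machine, wrapper in _WRAPPERS:
        if pattern.search(file_output):
            return None if host_machine == machine else wrapper
    # Assumption: the sketch fails loudly on anything it does not know.
    raise ValueError(f"unrecognized binary type: {file_output!r}")


if __name__ == "__main__":
    sample = "ELF 64-bit LSB executable, 64-bit PowerPC or cisco 7500"
    print(exe_wrapper(sample, "ppc64le"))  # native build, no wrapper
    print(exe_wrapper(sample, "x86_64"))   # cross build, needs qemu
```

Keeping the cases in a table makes the failure mode described above easier to spot: an architecture missing from the list breaks native builds too, not just cross builds.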
4. Missing optional vector intrinsics
Hardware-specific vector intrinsic functions (SIMD) are often used to improve performance, but software that uses them must provide a separate implementation for each supported architecture, which rarely includes Power. However, some software is designed to use vector intrinsics only when such an implementation exists, falling back to a non-optimized, cross-platform approach otherwise. In practice, though, if non-optimized platforms don’t get much testing, bugs that unintentionally prevent compilation on these systems can go unnoticed.
This issue occurred in a dependency used by Wavetable that contained, but did not require, optimized SIMD code for x86-64 and ARM, but caused errors on other architectures by mistakenly trying to use their nonexistent SIMD implementations. Because SIMD was designed to be optional in this dependency, the fix simply adds an architecture check:
@@ -80,5 +80,7 @@
#ifdef JUCE_32BIT
#define GIN_HAS_SIMD 0
- #else
+ #elif defined(JUCE_INTEL) || defined(JUCE_ARM)
#define GIN_HAS_SIMD 1
+ #else
+ #define GIN_HAS_SIMD 0
#endif
Another example comes, again, from JUCE. JUCE contains a copy of libpng, a PNG library that actually contains optimized VSX code for Power! But in a cruel twist of irony, JUCE excluded that optimized code from their copy, yet kept the code that tries to use it. The result? Linker failures that, because they occur in a helper tool almost always compiled early in the build process, prevent almost all software that uses JUCE from compiling on Power.
The fix for this one comes from libpng itself, which demonstrates the proper way of disabling optimizations, by defining certain macros instead of just deleting the implementations. So these macros simply need to be added to JUCE, which… already defines one of them?
#define PNG_ARM_NEON_OPT 0
Indeed, because JUCE’s copy of libpng excludes the optimized routines for all architectures, this issue presumably appeared on ARM at some point and was fixed (x86 happens not to exhibit the problem because libpng optimizations are opt-in on that architecture). That would have been a great time to include the other macros to disable optimizations, but instead, that task is accomplished by this fix:
@@ -268 +268,4 @@
#define PNG_ARM_NEON_OPT 0
+ #define PNG_POWERPC_VSX_OPT 0
+ #define PNG_INTEL_SSE_OPT 0
+ #define PNG_MIPS_MSA_OPT 0
5. Missing required vector intrinsics
Finally, one of the most common sources of incompatibility with Power in audio software is the non-optional use of SIMD, necessitating separate implementations for each architecture. Unsurprisingly, support for Power is not typically included.
My preferred approach in this case is to use SIMDe, a cross-platform implementation of x86 (and ARM) vector intrinsics, optimized with the platform’s native vector operations when available. In the case of Vaporizer2, that looks something like this:
@@ -6,5 +6,8 @@
#ifdef __aarch64__ //arm64
#include "../../sse2neon.h"
-#else
+#elif defined JUCE_INTEL
#include "immintrin.h"
+#else
+ #define SIMDE_ENABLE_NATIVE_ALIASES
+ #include <simde/x86/sse3.h>
#endif
Something similar is done for Vitalium, the fork of Vital in DISTRHO Ports. However, when attempting to use SIMDe to implement the x86 intrinsics it uses, I encountered odd runtime errors with backtraces that involved SIMDe. More investigation is needed to determine the cause, but because Vitalium also provides implementations of its vector-optimized routines for ARM, an easier workaround was to configure SIMDe to implement ARM’s vector intrinsics (NEON) instead, which appears to avoid that issue.
@@ -33,8 +33,13 @@
#else
- static_assert(false, "No SIMD Intrinsics found which are necessary for compilation");
+ #warning "No native SIMD support; using SIMDe"
+ #define SIMDE_ENABLE_NATIVE_ALIASES
+ #include <simde/arm/neon.h>
+ #define VITAL_NEON 1
+ #define VITAL_SIMDE 1
#endif
-#if VITAL_SSE2
+#if VITAL_SIMDE
+#elif VITAL_SSE2
#include <immintrin.h>
#elif VITAL_NEON
#include <arm_neon.h>
[See also x86intrin.h -- Ed.]
Lastly, another example of this comes—yet again—from JUCE, in a form that prompts a different solution. JUCE contains its own set of SIMD functions, designed with an architecture-independent API but requiring architecture-specific implementations. Although Power is predictably unsupported, JUCE does contain cross-platform fallback implementations for almost all of its SIMD API; however, these are used only as part of the architecture-specific implementations, to implement operations that don’t have a precise native equivalent on a given platform.
Why not, then, provide a universal fallback implementation for all unsupported architectures? That’s exactly what this change does (with a diff too large to include here), fixing another source of compilation errors in projects that use JUCE.
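The shape of that idea, reduced to a toy: keep a table of architecture-specific implementations and hand back a portable one whenever the table has no entry. The sketch below is not JUCE's API or the actual patch; every name in it is hypothetical, and a plain Python loop stands in for the scalar fallback.

```python
from typing import Callable, Dict, List

Vector = List[float]
AddFn = Callable[[Vector, Vector], Vector]


def add_portable(a: Vector, b: Vector) -> Vector:
    """Element-wise add with no architecture-specific code at all."""
    return [x + y for x, y in zip(a, b)]


# Optimized implementations, keyed by `uname -m`-style machine name.
_OPTIMIZED: Dict[str, AddFn] = {}


def register(machine: str, fn: AddFn) -> None:
    """Register an architecture-specific implementation."""
    _OPTIMIZED[machine] = fn


def select_add(machine: str) -> AddFn:
    """Pick the optimized routine if one exists, else the portable one."""
    return _OPTIMIZED.get(machine, add_portable)


if __name__ == "__main__":
    add = select_add("ppc64le")  # nothing registered, so portable is used
    print(add([1.0, 2.0], [3.0, 4.0]))
```

With this arrangement an unsupported architecture is merely slower rather than a compile error, which is the property the fallback is meant to provide.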
Conclusion
When ARM devices running desktop operating systems started to become more common, in particular due to Apple’s decision to move away from Intel, I had hoped that this would encourage x86-only software to become architecture-independent, benefiting Power in the process. Unfortunately, many projects have simply special-cased support for ARM, even when cross-platform alternatives exist (which don’t preclude optimized architecture-specific routines if desired). Maybe another architecture will eventually become the catalyst for this change, but until then, we’ll need patches.
I will, however, take a moment to complain again about JUCE. Despite making me sign their Contribution License Agreement, JUCE has ignored all of my pull requests (juce-pr1, juce-pr2, and juce-pr3), even the simplest two that would not require much review (yet are the most important fixes for Power). Because of this, every project that uses JUCE must, at a minimum, use a patched version in order to compile on Power.
Still, I think the biggest takeaway is that music production is absolutely possible on Power. The experience is undoubtedly rough around the edges, but I hope that PowerAudio’s ports can reduce the discrepancy in available tools compared to other architectures, and generally make this activity more accessible to Power ISA users.
Baseline JIT patches available for Firefox ESR128 on OpenPOWER
The OpenPOWER Firefox JIT still crashes badly in Wasm and Ion for reasons I have yet to ascertain, but the Baseline Interpreter and Baseline Compiler stages of the JIT continue to work great and are significantly faster than the plain interpreter (even in a PGO-LTO build), so I did the needful and finally got them pulled up to the new Extended Support Release, which is Firefox 128.
I then spent the last two days bashing out crashes and bugs, including a regression from Firefox's new WebAssembly-based in-browser translation engine. The browser chrome now assumes that WebAssembly is always present, but on JIT-less tier-3 machines (or partially implemented JITs like ours, and possibly where Wasm is disabled in prefs) it isn't, so it hits an uncaught error which then blows up substantial portions of the browser UI like the stop-reload button and context menus. The Fedora official ppc64le build of Firefox 128.0.3 is affected as well; I filed bug 1912623 with a provisional fix. Separately all JIT and JavaScript tests completely pass in multiple permutations of Baseline Interpreter and Baseline Compiler, single- and multi-threaded.
As a sign of confidence I've been dogfooding it for the last 24 hours with my typical massive number of tabs and add-ons and can't get it to crash anymore, so I'm typing this blog post in it and using it to upload its own changesets to Github. Grab the ESR source from Mozilla (either pull a tree with Mercurial or just download an archive) and apply the changesets in numerical order, though after bug 1912623 is fixed you won't need #823094. The necessary .mozconfig for building an LTO-PGO build, which is what I'm using, is also in that issue; it's pretty much the same as earlier ones except for --enable-jit.
Little-endian POWER9 remains the officially supported architecture. This version has not been tested on POWER8 or big-endian POWER9, though the JIT should still statically disable itself even if compiled with it on, so the browser should still otherwise work normally. If this is not the case, I consider that a bug, and will accept a fix (I don't have a POWER8 system here to test against). There are no Power10 specific instructions, but I don't see any reason why it wouldn't work on a Power10 machine or on a SolidSilicon S1 whenever we get one of those.
Comments always solicited, though backtraces and reliable STRs are needed to diagnose any bug, of course. Meanwhile I've got more work cut out for me but at least we're back in the saddle for another go.
Chromium Power ISA patches ... from Solid Silicon
Oh, yeah, it really is that Solid Silicon. You can make your own speculations from the commit log, though regardless of whether Solid Silicon is truly a separate concern or a Raptor subsidiary, it wouldn't be surprising that Raptor resources are assisting since they've kind of bet the store on the S1.
Timothy Pearson's comments in the Electron Github suggest that Google has been pretty resistant to incorporating support for architectures outside of their core platforms. This is not a wholly unreasonable position on Google's part but it's not a particularly charitable one, and unlike Mozilla, the Chrome team doesn't really have the concept of a tier-3 build nor any motivation to. That kind of behaviour is all the more reason not to encourage browser monocultures because it's not just the layout engine that causes vendor lock-in. Fortunately V8, the JavaScript engine, is maintained separately, and reportedly has been more accommodating presumably because of things like Node.js on IBM hardware (even IBM i via PASE!).
Mozilla is much more accepting of this as long as regressions aren't introduced. This is why TenFourFox patches were largely not upstreamed, since they would potentially cause problems with Cocoa widgets in later versions of macOS, though I did upstream what patches were generally applicable. The main reason I'm still maintaining the Firefox ppc64le JIT patches outside is because I still can't solve these recent startup crashes deep within Wasm code, which largely limits me to Baseline Compiler and thus is not suitable for loading into the tree yet (we'd have to also upstream pref changes that would adversely affect tier-1 until this is fixed). I still intend to pull these patches up to the next ESR, especially since Github is glacially slow now without a JIT and it's affecting my personal ability to do other tasks. Maybe I should be working on something like rr for ppc64le at the same time because stepping through deeply layered code in gdb is a great way to go stark raving mad.