Posts

Latest Posts

Updated Baseline JIT OpenPOWER patches for Firefox 128ESR


I updated the Baseline JIT patches to apply against Firefox 128ESR, though if you use the Mercurial rebase extension (and you should), it will rebase automatically and only one file had to be merged — which it did for me also. Nevertheless, everything is up to date against tip again, and this patchset works fine for both Firefox and Thunderbird. I kept the fix for bug 1912623 because I think Mozilla's fix in bug 1909204 is wrong (or at least suboptimal) and this is faster on systems without working Wasm. Speaking of, I need to get back into porting rr to ppc64le so I can solve those startup crashes.

Fedora 41


Just a quick note that Fedora 41 is out, the standard Linux distro we use here on our two Raptor systems. Based on kernel 6.11, he big updates are updated packaging with DNF 5 and RPM 4.20 plus Perl 5.40, Python 3.13, LLVM 19 and gcc 14.1/glibc 2.40. Fortunately there are apparently still optional Xorg support packages for GNOME and KDE Plasma. Once the package repos have stabilized a bit, I'll have the usual minireview. For now, refresh your memory with how things went with Fedora 40, and of course Fedora 38 will be EOL in about a month based on their standard schedule.

Running Thunderbird with the OpenPower Baseline JIT


The issues with Ion and Wasm in OpenPower Firefox notwithstanding, the Baseline JIT works well in Firefox ESR128, and many of you use it (including yours truly). Of course, that makes Thunderbird look sluggish without it.

I wasn't able to get a full LTO-PGO build for Thunderbird to build properly so far with gcc (workin' on it), but with the JIT patches for ESR128 an LTO optimized build will complete and run, and that's good enough for now. The diff for the .mozconfig is more or less the following:

export CC=/usr/bin/gcc
export CXX=/usr/bin/g++

mk_add_options MOZ_MAKE_FLAGS="-j24"

#ac_add_options --enable-application=browser
#ac_add_options MOZ_PGO=1
#
ac_add_options --enable-project=comm/mail
mk_add_options MOZ_OBJDIR=@TOPSRCDIR@/tbobj

ac_add_options --enable-optimize="-O3 -mcpu=power9 -fpermissive"
ac_add_options --enable-release
ac_add_options --enable-linker=bfd
ac_add_options --enable-lto=full
ac_add_options --without-wasm-sandboxed-libraries
ac_add_options --with-libclang-path=/usr/lib64

export GN=/home/censored/bin/gn # if you haz
export RUSTC_OPT_LEVEL=2

You can use a unified .mozconfig like this to handle both the browser and the E-mail client; if you do, to build the browser the commented lines should be uncommented and the two lines below the previously commented section should be commented.

You'll need comm-central embedded in your ESR128 tree as per the build instructions, and you may want to create an .hg/hgignore file inside your ESR128 source directory as well to keep changes to the core and Tbird from clashing, something like

^tbobj/
^comm/

which will ignore those directories but isn't a change to .hgignore that you have to manually edit out. Once constructed, your built client will be in tbobj/. If you were using a prebuilt Thunderbird before, you may need to start it with tbobj/dist/bin/thunderbird -p default-release (substitute your profile name if it differs) to make sure you get your old mailbox back, though as always backup your profile first.

Music production on Power: an adventure in porting


[Here's a guest post from taylor.fish on their porting work on music and audio software. I thought it made a good tutorial on porting and also is a great way to show off the diverse things people are doing on OpenPower. Like all guest and first-party posts on Talospace, this article remains the property of the original author and may be distributed under CC-BY-SA 4.0. -- Ed.]

For the past five years, I’ve used a Blackbird as my primary computing device. Prior to that I used x86 systems flashed with Libreboot, but aging hardware running unsupported firmware only gets you so far.

The switch to a truly owner-controlled Power ISA system was a welcome one, but it wasn’t without its growing pains. I could no longer assume any given piece of Linux-compatible software would successfully compile and run on my machine, due to the small but significant portion with architecture-specific issues. That didn’t deter me from continuing to use my Blackbird as my main device, but when I wanted to get back into music production, I knew I would have to confront this problem: my previous endeavors, although sticking entirely to free software, were all on x86.

Of particular importance in digital music production are plugins, which include instruments like synthesizers and samplers, effects like equalizers and reverb, and analysis tools like spectrograms and oscilloscopes. While there are some excellent plugins released as free software, they are vastly outnumbered by proprietary ones, and it takes dedication to commit to producing music only with free software. Not wanting that uphill battle to feel more like a cliff, when I discovered that some of those plugins wouldn’t run on Power, I was determined to change that fact rather than lose what few tools I had.

And so my porting adventure began. Over the past year or so I developed patches for every piece of free/libre audio software that I wanted to use but that didn’t work on Power. Some of those changes have been merged upstream, but for all the others I maintain a GitHub organization called PowerAudio that contains forked versions of each repository with ppc64le patches applied (along with some other improvements for use on GNU/Linux). If you just want working audio plugins on Power, visit that page, which has more information about each piece of ported software. If you want to know more about what the porting process entailed, however, read on…

The porting process

Each piece of software to be ported has its own unique set of problems, but there are some common themes. Here are some of the most frequent issues that prevent audio plugins from working on Power ISA systems:

1. Architecture-specific compiler options

This is the easiest issue to fix. Some projects pass architecture-specific options to the compiler (like -msse on x86) but don’t restrict those options only to the relevant architectures. In that case the fix is simply to perform an architecture check before applying those options, as in this change to tap-lv2:

@@ -10 +10,3 @@
+ifneq ($(findstring $(shell uname -m),x86_64 amd64 i386 i686),)
 CFLAGS += -mtune=generic -msse -msse2 -mfpmath=sse
+endif

Some projects handle specific architectures but then assume x86 as a fallback. In that case, it’s easiest to continue the pattern and add ppc as one of the cases, as in this change to Helm:

@@ -25,4 +25,8 @@ ifneq (,$(findstring aarch,$(MACHINE)))
 ifneq (,$(findstring arm,$(MACHINE)))
	SIMDFLAGS := -march=armv8-a -mtune=cortex-a53 -mfpu=neon-fp-armv8 -mfloat-abi=hard
+else
+ifneq (,$(findstring ppc,$(MACHINE)))
+	SIMDFLAGS :=
 else
	SIMDFLAGS := -msse2
+endif

2. Assumptions broken by Power

Some code appears to be cross-platform but contains assumptions that don’t apply to Power ISA systems. For example, the plugin framework DPF used the first part of gcc -dumpmachine to obtain the correct directory name for VST plugins. On 64-bit little-endian PowerPC, that yields powerpc64le. But the VST 3 specification says the directory name should match uname -m, which is ppc64le. These two identifiers happen to be identical for many architectures, but generally differ on PowerPC and Power ISA systems.

Simply using uname -m here would break cross-compilation, so the fix performs a text substitution to correct the discrepancy on Power:

@@ -691 +691,3 @@ ifeq ($(LINUX),true)
-VST3_BINARY_DIR = Contents/$(TARGET_PROCESSOR)-linux
+# This must match `uname -m`, which differs from `gcc -dumpmachine` on PowerPC.
+VST3_ARCHITECTURE := $(patsubst powerpc%,ppc%,$(TARGET_PROCESSOR))
+VST3_BINARY_DIR = Contents/$(VST3_ARCHITECTURE)-linux

A similar issue existed in JUCE, a popular framework for audio plugins. JUCE performs architecture detection with a chain of preprocessor conditionals that (somewhat hackily) use #error to emit the correct architecture identifier, which JUCE uses in places that expect the output of uname -m. While those conditionals do attempt to detect PowerPC, they don’t account for the fact that the different endiannesses of 64-bit PowerPC have different identifiers, and incorrectly classify ppc64le as ppc64. This can cause compilation to fail entirely when JUCE uses the incorrect architecture name but then runs a validator that expects the correct one.

A simple endianness check fixes this one:

@@ -64,3 +64,7 @@ #elif defined(__ppc__) || defined(__ppc) || ...
   #if defined(__ppc64__) || defined(__powerpc64__) || defined(__64BIT__)
-    #error JUCE_ARCH ppc64
+    #ifdef __LITTLE_ENDIAN__
+      #error JUCE_ARCH ppc64le
+    #else
+      #error JUCE_ARCH ppc64
+    #endif
   #else

3. Lack of inclusion in platform-specific code

Some projects contain truly platform-specific code that needs to be written separately for each architecture, but don’t include PowerPC one of the handled cases. In these situations, the most straightforward (and sometimes only) fix is to add the necessary Power-specific code. For example, sfizz contained a copy of a low-level dependency that only supported x86 and ARM, but because that dependency had already added support for Power and other architectures upstream, all that was necessary for sfizz was to update the dependency.

Another example comes from DISTRHO Ports, a large collection of plugins ported to GNU/Linux, in which an architecture detection script required adding code to detect the various types of PowerPC:

@@ -42,4 +42,19 @@
     elif echo "${fileout}" | grep -q "x86-64"; then
         if [ "$(uname -m)" != "x86_64" ]; then
             MESON_EXE_WRAPPER="qemu-x86_64-static"
         fi
+
+    elif echo "${fileout}" | grep -q "64-bit LSB.*PowerPC"; then
+        if [ "$(uname -m)" != "ppc64le" ]; then
+            MESON_EXE_WRAPPER="qemu-ppc64le-static"
+        fi
+
+    elif echo "${fileout}" | grep -q "64-bit MSB.*PowerPC"; then
+        if [ "$(uname -m)" != "ppc64" ]; then
+            MESON_EXE_WRAPPER="qemu-ppc64-static"
+        fi
...

Although the purpose of this script is to aid cross-compilation, the lack of Power support prevented even native compilation.

4. Missing optional vector intrinsics

Hardware-specific vector intrinsic functions (SIMD) are often used to improve performance, but software that uses them must provide a separate implementation for each supported architecture, which rarely includes Power. However, some software is designed to use vector intrinsics only when such an implementation exists, falling back to a non-optimized, cross-platform approach otherwise. In practice, though, if non-optimized platforms don’t get much testing, bugs that unintentionally prevent compilation on these systems can go unnoticed.

This issue occurred in a dependency used by Wavetable that contained, but did not require, optimized SIMD code for x86–64 and ARM, but caused errors on other architectures by mistakenly trying to use their nonexistent SIMD implementations. Because SIMD was designed to be optional in this dependency, the fix simply adds an architecture check:

@@ -80,5 +80,7 @@
  #ifdef JUCE_32BIT
   #define GIN_HAS_SIMD 0
- #else
+ #elif defined(JUCE_INTEL) || defined(JUCE_ARM)
   #define GIN_HAS_SIMD 1
+ #else
+  #define GIN_HAS_SIMD 0
  #endif

Another example comes, again, from JUCE. JUCE contains a copy of libpng, a PNG library that actually contains optimized VSX code for Power! But in a cruel twist of irony, JUCE excluded that optimized code from their copy, yet kept the code that tries to use it. The result? Linker failures that, because they occur in a helper tool almost always compiled early in the build process, prevent almost all software that uses JUCE from compiling on Power.

The fix for this one comes from libpng itself, which demonstrates the proper way of disabling optimizations, by defining certain macros instead of just deleting the implementations. So these macros simply need to be added to JUCE, which… already defines one of them?

#define PNG_ARM_NEON_OPT 0

Indeed, because JUCE’s copy of libpng excludes the optimized routines for all architectures, this issue presumably appeared on ARM at some point and was fixed (x86 happens not to exhibit the problem because libpng optimizations are opt-in on that architecture). That would have been a great time to include the other macros to disable optimizations, but instead, that task is accomplished by this fix:

@@ -268 +268,4 @@
   #define PNG_ARM_NEON_OPT 0
+  #define PNG_POWERPC_VSX_OPT 0
+  #define PNG_INTEL_SSE_OPT 0
+  #define PNG_MIPS_MSA_OPT 0

5. Missing required vector intrinsics

Finally, one of the most common sources of incompatibility with Power in audio software is the non-optional use of SIMD, necessitating separate implementations for each architecture. Unsurprisingly, support for Power is not typically included.

My preferred approach in this case is to use SIMDe, a cross-platform implementation of x86 (and ARM) vector intrinsics, optimized with the platform’s native vector operations when available. In the case of Vaporizer2, that looks something like this:

@@ -6,5 +6,8 @@
 #ifdef __aarch64__ //arm64
	#include "../../sse2neon.h"
-#else
+#elif defined JUCE_INTEL
	#include "immintrin.h"
+#else
+	#define SIMDE_ENABLE_NATIVE_ALIASES
+	#include <simde/x86/sse3.h>
 #endif

Something similar is done for Vitalium, the fork of Vital in DISTRHO Ports. However, when attempting to use SIMDe to implement the x86 intrinsics it uses, I encountered odd runtime errors with backtraces that involved SIMDe. More investigation is needed to determine the cause, but because Vitalium also provides implementations of its vector-optimized routines for ARM, an easier workaround was to configure SIMDe to implement ARM’s vector intrinsics (NEON) instead, which appears to avoid that issue.

@@ -33,8 +33,13 @@
 #else
-  static_assert(false, "No SIMD Intrinsics found which are necessary for compilation");
+  #warning "No native SIMD support; using SIMDe"
+  #define SIMDE_ENABLE_NATIVE_ALIASES
+  #include <simde/arm/neon.h>
+  #define VITAL_NEON 1
+  #define VITAL_SIMDE 1
 #endif

-#if VITAL_SSE2
+#if VITAL_SIMDE
+#elif VITAL_SSE2
   #include <immintrin.h>
 #elif VITAL_NEON
   #include <arm_neon.h>

[See also x86intrin.h -- Ed.]

Lastly, another example of this comes—yet again—from JUCE, in a form that prompts a different solution. JUCE contains its own set of SIMD functions, designed with an architecture-independent API but requiring architecture-specific implementations. Although Power is predictably unsupported, JUCE does contain cross-platform fallback implementations for almost all of its SIMD API; however, these are used only as part of the architecture-specific implementations, to implement operations that don’t have a precise native equivalent on a given platform.

Why not, then, provide a universal fallback implementation for all unsupported architectures? That’s exactly what this change does (with a diff too large to include here), fixing another source of compilation errors in projects that use JUCE.

Conclusion

When ARM devices running desktop operating systems started to become more common, in particular due to Apple’s decision to move away from Intel, I had hoped that this would encourage x86-only software to become architecture-independent, benefiting Power in the process. Unfortunately, many projects have simply special-cased support for ARM, even when cross-platform alternatives exist (which don’t preclude optimized architecture-specific routines if desired). Maybe another architecture will eventually become the catalyst for this change, but until then, we’ll need patches.

I will, however, take a moment to complain again about JUCE. Despite making me sign their Contribution License Agreement, JUCE has ignored all of my pull requests (juce-pr1, juce-pr2, and juce-pr3), even the simplest two that would not require much review (yet are the most important fixes for Power). Because of this, every project that uses JUCE must, at a minimum, use a patched version in order to compile on Power.

Still, I think the biggest takeaway is that music production is absolutely possible on Power. The experience is undoubtedly rough around the edges, but I hope that PowerAudio’s ports can reduce the discrepancy in available tools compared to other architectures, and generally make this activity more accessible to Power ISA users.

Baseline JIT patches available for Firefox ESR128 on OpenPOWER


It's been a long hot summer at $DAYJOB and I haven't had much time for much of anything, but I got granted some time this week to take care of an unrelated issue and seized the opportunity to get caught up.

The OpenPOWER Firefox JIT still crashes badly in Wasm and Ion for reasons I have yet to ascertain, but the Baseline Interpreter and Baseline Compiler stages of the JIT continue to work great and are significantly faster than the baseline Interpreter (even in a PGO-LTO build), so I did the needful and finally got them pulled up to the new Extended Support Release which is Firefox 128.

I then spent the last two days bashing out crashes and bugs, including a regression from Firefox's new WebAssembly-based in-browser translation engine. The browser chrome now assumes that WebAssembly is always present, but on JIT-less tier-3 machines (or partially implemented JITs like ours, and possibly where Wasm is disabled in prefs) it isn't, so it hits an uncaught error which then blows up substantial portions of the browser UI like the stop-reload button and context menus. The Fedora official ppc64le build of Firefox 128.0.3 is affected as well; I filed bug 1912623 with a provisional fix. Separately all JIT and JavaScript tests completely pass in multiple permutations of Baseline Interpreter and Baseline Compiler, single- and multi-threaded.

As a sign of confidence I've been dogfooding it for the last 24 hours with my typical massive number of tabs and add-ons and can't get it to crash anymore, so I'm typing this blog post in it and using it to upload its own changesets to Github. Grab the ESR source from Mozilla (either pull a tree with Mercurial or just download an archive) and apply the changesets in numerical order, though after bug 1912623 is fixed you won't need #823094. The necessary .mozconfig for building an LTO-PGO build, which is what I'm using, is also in that issue; it's pretty much the same as earlier ones except for --enable-jit.

Little-endian POWER9 remains the officially supported architecture. This version has not been tested on POWER8 or big-endian POWER9, though the JIT should still statically disable itself even if compiled with it on, so the browser should still otherwise work normally. If this is not the case, I consider that a bug, and will accept a fix (I don't have a POWER8 system here to test against). There are no Power10 specific instructions, but I don't see any reason why it wouldn't work on a Power10 machine or on a SolidSilicon S1 whenever we get one of those.

Comments always solicited, though backtraces and reliable STRs are needed to diagnose any bug, of course. Meanwhile I've got more work cut out for me but at least we're back in the saddle for another go.

Chromium Power ISA patches ... from Solid Silicon


It appears that some of the issues observed by me and others with Chromium on Fedora ppc64le may in fact be due to an incomplete patch set, which is now available on Solid Silicon's Gitlab. If your distro doesn't support this, now you have an upstream to point them at or build your own. They include the Ungoogled changes as well, even though I retain my philosophical objections to Chromium, and still use Firefox personally (I've got to get back on the horse and resume maintaining my personal builds now that I've got Plasma 6 back running on Xorg again).

Oh, yeah, it really is that Solid Silicon. You can make your own speculations from the commit log, though regardless of whether Solid Silicon is truly a separate concern or a Raptor subsidiary, it wouldn't be surprising that Raptor resources are assisting since they've kind of bet the store on the S1.

Timothy Pearson's comments in the Electron Github suggest that Google has been pretty resistant to incorporating support for architectures outside of their core platforms. This is not a wholly unreasonable position on Google's part but it's not a particularly charitable one, and unlike Mozilla, the Chrome team doesn't really have the concept of a tier-3 build nor any motivation to. That kind of behaviour is all the more reason not to encourage browser monocultures because it's not just the layout engine that causes vendor lock-in. Fortunately V8, the JavaScript engine, is maintained separately, and reportedly has been more accommodating presumably because of things like Node.js on IBM hardware (even IBM i via PASE!).

Mozilla is much more accepting of this as long as regressions aren't introduced. This is why TenFourFox patches were largely not upstreamed since they would potentially cause problems with Cocoa widgets in later versions of macOS, though what patches were generally applicable I would do so. The main reason I'm still maintaining the Firefox ppc64le JIT patches outside is because I still can't solve these recent startup crashes deep within Wasm code, which largely limits me to Baseline Compiler and thus is not suitable for loading into the tree yet (we'd have to also upstream pref changes that would adversely affect tier-1 until this is fixed). I still intend to pull these patches up to the next ESR, especially since Github is glacially slow now without a JIT and it's affecting my personal ability to do other tasks. Maybe I should be working on something like rr for ppc64le at the same time because stepping through deeply layered code in gdb is a great way to go stark raving mad.