<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="/feed.xml" rel="self" type="application/atom+xml" /><link href="/" rel="alternate" type="text/html" /><updated>2025-06-30T12:50:18+00:00</updated><id>/feed.xml</id><title type="html">bjorn3</title><entry><title type="html">Progress report on rustc_codegen_cranelift (June 2025)</title><link href="/2025/06/30/progress-report-june-2025.html" rel="alternate" type="text/html" title="Progress report on rustc_codegen_cranelift (June 2025)" /><published>2025-06-30T00:00:00+00:00</published><updated>2025-06-30T00:00:00+00:00</updated><id>/2025/06/30/progress-report-june-2025</id><content type="html" xml:base="/2025/06/30/progress-report-june-2025.html"><![CDATA[<p>There has been a fair bit of progress since the <a href="https://bjorn3.github.io/2024/11/14/progress-report-nov-2024.html">last progress report</a>! There have been <a href="https://github.com/rust-lang/rustc_codegen_cranelift/compare/0b8e94eb69e0901b42e91c3b713207b33f4e46b2...c713ffab3c6e28ab4b4dd4e392330f786ea657ad">476 commits</a> since the last progress report.</p>

<p>You can find a precompiled version of cg_clif at <a href="https://github.com/rust-lang/rustc_codegen_cranelift/releases/tag/dev">https://github.com/rust-lang/rustc_codegen_cranelift/releases/tag/dev</a> or in the rustc-codegen-cranelift-preview rustup component if you want to try it out.</p>

<h1 id="achievements-in-the-past-7-months">Achievements in the past 7 months</h1>

<h4 id="unwinding">Unwinding</h4>

<p>Cranelift has finally implemented support for cleanup during stack unwinding on Linux.</p>

<p>A little bit of history: as part of my bachelor thesis, which I finished a little under a year ago, I implemented support for unwinding in Cranelift. This was mostly working. However, when I revisited the code after I finished writing my thesis to get it upstreamed, I discovered that in some cases the register allocator would insert moves after a call instruction that can unwind, and then expect these moves to be executed before jumping to any of the successors of the call instruction. This can’t happen when unwinding, as unwinding jumps directly from the unwinding call to the exception handler block. I tried for a bit to fix this, but got stuck on limitations in Cranelift’s register allocator. In addition, I got busy with my day job. Fast forward to about two months ago, when Chris Fallin (the main author of major parts of Cranelift) started implementing support for exception handling in Cranelift, fixing the register allocator limitations that I got stuck on in the process. The overall design is similar to my proposal, though the details of the Cranelift IR extensions are more elegant than what I previously came up with. After a couple of small fixes on the Cranelift side, I was able to rebase the cg_clif changes from my thesis on top of the newly landed Cranelift changes with minor effort. Thanks a lot for working on unwinding support for Cranelift, Chris!</p>

<p>A walkthrough of how unwinding is actually implemented in cg_clif can be found at <a href="https://tweedegolf.nl/en/blog/157/exception-handling-in-rustc-codegen-cranelift">https://tweedegolf.nl/en/blog/157/exception-handling-in-rustc-codegen-cranelift</a>.</p>

<p>Unwinding support in cg_clif will remain disabled by default for now, pending investigation of some build performance issues. In addition, it currently doesn’t work on Windows and macOS. On macOS there are some minor differences around the exact encoding of the unwinding tables that haven’t been implemented yet. On Windows, however, adding support will be a fair bit more complicated: Windows uses funclet-based SEH rather than the landing-pad-based Itanium unwinding (<code class="language-plaintext highlighter-rouge">.eh_frame</code>), and Cranelift only supports landing pads.</p>
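<p>To make concrete what “cleanup during stack unwinding” means from the point of view of Rust code, here is a minimal sketch. It is not taken from cg_clif; the names <code class="language-plaintext highlighter-rouge">Guard</code>, <code class="language-plaintext highlighter-rouge">CLEANED_UP</code>, <code class="language-plaintext highlighter-rouge">panics_with_guard</code> and <code class="language-plaintext highlighter-rouge">run_demo</code> are made up for illustration. It assumes a backend and configuration where unwinding is enabled.</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code>use std::panic;
use std::sync::atomic::{AtomicBool, Ordering};

// Hypothetical flag, used only to observe that the destructor ran.
static CLEANED_UP: AtomicBool = AtomicBool::new(false);

struct Guard;

impl Drop for Guard {
    fn drop(&amp;mut self) {
        // With unwinding support, this runs while the panic propagates
        // through `panics_with_guard`.
        CLEANED_UP.store(true, Ordering::SeqCst);
    }
}

fn panics_with_guard() {
    let _guard = Guard;
    panic!("boom");
}

/// Returns (the panic was caught, the destructor ran).
fn run_demo() -&gt; (bool, bool) {
    let result = panic::catch_unwind(panics_with_guard);
    (result.is_err(), CLEANED_UP.load(Ordering::SeqCst))
}
</code></pre></div></div>

<p>With unwinding enabled, <code class="language-plaintext highlighter-rouge">run_demo()</code> should report that the panic was caught and that <code class="language-plaintext highlighter-rouge">Guard</code> was dropped. Without unwinding support the process aborts at the <code class="language-plaintext highlighter-rouge">panic!</code> and neither happens.</p>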

<ul>
  <li>issue <a href="https://github.com/bytecodealliance/wasmtime/issues/1677">wasmtime#1677</a>: Support cleanup during unwinding</li>
  <li><a href="https://github.com/bytecodealliance/rfcs/pull/36">bytecodealliance/rfcs#36</a>: Implementing the exception handling proposal in Wasmtime</li>
  <li>issue <a href="https://github.com/rust-lang/rustc_codegen_cranelift/issues/1567">#1567</a>: Support unwinding on panics</li>
  <li><a href="https://github.com/bytecodealliance/wasmtime/pull/10485">wasmtime#10485</a>: Cranelift: remove block params on critical-edge blocks. (thanks @cfallin!)</li>
  <li><a href="https://github.com/bytecodealliance/wasmtime/pull/10502">wasmtime#10502</a>: Cranelift: remove return-value instructions after calls at callsites. (thanks @cfallin!)</li>
  <li><a href="https://github.com/bytecodealliance/wasmtime/pull/10510">wasmtime#10510</a>: Cranelift: initial try_call / try_call_indirect (exception) support. (thanks @cfallin!)</li>
  <li><a href="https://github.com/bytecodealliance/wasmtime/pull/10593">wasmtime#10593</a>: Some fixes for try_call</li>
  <li><a href="https://github.com/bytecodealliance/wasmtime/pull/10609">wasmtime#10609</a>: Cranelift: move exception-handler metadata into callsites. (by me and @cfallin)</li>
  <li><a href="https://github.com/bytecodealliance/wasmtime/pull/10702">wasmtime#10702</a>: Avoid clobbering all float registers in the presence of try_call on arm64</li>
  <li><a href="https://github.com/bytecodealliance/wasmtime/pull/10709">wasmtime#10709</a>: Cranelift: fix invalid regalloc constraints on try-call with empty handler list. (thanks @cfallin!)</li>
  <li><a href="https://github.com/rust-lang/rustc_codegen_cranelift/commit/ab514c95967a7c5d732aa1e3800afc4d9cb252f9">ab514c9</a>: Pass UnwindAction to a couple of functions</li>
  <li><a href="https://github.com/rust-lang/rustc_codegen_cranelift/commit/9495eb517e5a2b76fcdb514eeec5aa4d8fd16320">9495eb5</a>: Pass Module to UnwindContext</li>
  <li><a href="https://github.com/rust-lang/rustc_codegen_cranelift/pull/1575">#1575</a>: Preparations for exception handling support</li>
  <li><a href="https://github.com/rust-lang/rustc_codegen_cranelift/pull/1584">#1584</a>: Experimental exception handling support on Linux</li>
</ul>

<h4 id="arm">ARM</h4>

<p>CI now builds and tests on native arm64 Linux systems rather than testing a subset of the tests in QEMU. Inline asm on arm64 can now use vector registers. And the half and bytecount crates are now fixed on arm64.</p>

<ul>
  <li><a href="https://github.com/rust-lang/rustc_codegen_cranelift/pull/1557">#1557</a>: Test and dist for arm64 linux on CI</li>
  <li><a href="https://github.com/rust-lang/rustc_codegen_cranelift/pull/1564">#1564</a>: Fix usage of vector registers in inline asm on arm64</li>
  <li><a href="https://github.com/rust-lang/rustc_codegen_cranelift/pull/1566">#1566</a>: Fix the half and bytecount crates on arm64</li>
</ul>

<h4 id="f16f128-support">f16/f128 support</h4>

<p>@beetrees contributed support for the unstable f16 and f128 types.</p>

<ul>
  <li><a href="https://github.com/bytecodealliance/wasmtime/pull/8860">wasmtime#8860</a>: Initial f16 and f128 support (thanks @beetrees!)</li>
  <li><a href="https://github.com/bytecodealliance/wasmtime/pull/9045">wasmtime#9045</a>: Add initial f16 and f128 support to the x64 backend (thanks @beetrees!)</li>
  <li><a href="https://github.com/bytecodealliance/wasmtime/pull/9076">wasmtime#9076</a>: Add initial f16 and f128 support to the aarch64 backend (thanks @beetrees!)</li>
  <li><a href="https://github.com/bytecodealliance/wasmtime/pull/10652">wasmtime#10652</a>: Add inital support for f16 without Zfh and f128 to the riscv64 backend (thanks @beetrees!)</li>
  <li><a href="https://github.com/bytecodealliance/wasmtime/pull/10691">wasmtime#10691</a>: Add initial f16 and f128 support to the s390x backend (thanks @beetrees!)</li>
  <li><a href="https://github.com/rust-lang/rustc_codegen_cranelift/pull/1574">#1574</a>: Add f16/f128 support (thanks @beetrees!)</li>
</ul>

<h4 id="sharing-code-between-codegen-backends">Sharing code between codegen backends</h4>

<p>I’ve made two PRs to rustc to share more code between codegen backends. This reduces the maintenance burden of both cg_clif and rustc. In the future I would like to migrate the entire inline asm handling of cg_clif to cg_ssa, where it can serve as a fallback for codegen backends that don’t natively support inline asm.</p>

<ul>
  <li><a href="https://github.com/rust-lang/rust/pull/132820">rust#132820</a>: Add a default implementation for CodegenBackend::link</li>
  <li><a href="https://github.com/rust-lang/rust/pull/134232">rust#134232</a>: Share the naked asm impl between cg_ssa and cg_clif</li>
  <li><a href="https://github.com/rust-lang/rust/pull/141769">rust#141769</a>: Move metadata object generation for dylibs to the linker code</li>
</ul>

<h4 id="simd">SIMD</h4>

<p>Some new vendor intrinsics were implemented.</p>

<ul>
  <li><a href="https://github.com/rust-lang/rustc_codegen_cranelift/commit/b004312ee4c8418e5a42cc25b971fa5fc5ac88b7">b004312</a>: Implement arm64 vaddlvq_u8 and vld1q_u8_x4 vendor intrinsics</li>
  <li><a href="https://github.com/rust-lang/rustc_codegen_cranelift/commit/1afce7c3548ff31174cb060f3217b1994d982bed">1afce7c</a>: Implement simd_insert_dyn and simd_extract_dyn intrinsics</li>
  <li><a href="https://github.com/rust-lang/rustc_codegen_cranelift/commit/49bfa1aaf5f7e68079e6ed9b0d23dacebf38bac9">49bfa1a</a>: Fix simd_insert_dyn and simd_extract_dyn intrinsics with non-pointer sized indices</li>
</ul>

<h4 id="abi">ABI</h4>

<p>ABI handling for 128-bit integer libcalls has been improved. In addition, the abi-cafe version we test against has been updated to 1.0. Thanks to a bunch of new features it has, we no longer need to patch its source code, making future updates easier.</p>

<ul>
  <li><a href="https://github.com/rust-lang/rustc_codegen_cranelift/pull/1546">#1546</a>: Fix the ABI for libcalls</li>
  <li><a href="https://github.com/rust-lang/rustc_codegen_cranelift/pull/1582">#1582</a>: Update to abi-cafe 1.0</li>
  <li><a href="https://github.com/rust-lang/rustc_codegen_cranelift/commit/b7cfe2f4db9e7740f6302a1627d1c087054e64b4">b7cfe2f</a>: Use the new --debug flag of abi-cafe</li>
</ul>

<h1 id="challenges">Challenges</h1>

<h4 id="simd-1">SIMD</h4>

<p>While <code class="language-plaintext highlighter-rouge">core::simd</code> is fully supported through emulation using scalar operations, many platform specific vendor intrinsics in <code class="language-plaintext highlighter-rouge">core::arch</code> are not supported. This has been improving though with the most important x86_64 and arm64 vendor intrinsics implemented.</p>

<p>If your program uses any unsupported vendor intrinsics you will get a compile time warning and if it actually gets reached, the program will abort with an error message indicating which intrinsic is unimplemented. Please open an issue if this happens.</p>
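<p>For readers unfamiliar with the distinction, the sketch below shows what a call to a platform specific vendor intrinsic looks like next to a plain scalar fallback. The function name <code class="language-plaintext highlighter-rouge">add4</code> is made up for illustration, and the example makes no claim about which intrinsics cg_clif does or does not implement. It only assumes that SSE2 is part of the x86_64 baseline.</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code>/// Adds two 4-lane vectors using x86_64 vendor intrinsics.
#[cfg(target_arch = "x86_64")]
fn add4(a: [i32; 4], b: [i32; 4]) -&gt; [i32; 4] {
    use std::arch::x86_64::{__m128i, _mm_add_epi32, _mm_loadu_si128, _mm_storeu_si128};

    let mut out = [0i32; 4];
    // SAFETY: SSE2 is always available on x86_64, and the unaligned
    // load/store intrinsics only need 16 readable/writable bytes.
    unsafe {
        let va = _mm_loadu_si128(a.as_ptr() as *const __m128i);
        let vb = _mm_loadu_si128(b.as_ptr() as *const __m128i);
        _mm_storeu_si128(out.as_mut_ptr() as *mut __m128i, _mm_add_epi32(va, vb));
    }
    out
}

/// Scalar fallback with the same wrapping semantics for other targets.
#[cfg(not(target_arch = "x86_64"))]
fn add4(a: [i32; 4], b: [i32; 4]) -&gt; [i32; 4] {
    let mut out = [0i32; 4];
    for i in 0..4 {
        out[i] = a[i].wrapping_add(b[i]);
    }
    out
}
</code></pre></div></div>

<p>A codegen backend has to recognise each such intrinsic individually, which is why coverage grows one batch at a time.</p>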

<ul>
  <li>issue <a href="https://github.com/rust-lang/rustc_codegen_cranelift/issues/171">#171</a>: std::arch SIMD intrinsics</li>
</ul>

<h4 id="abi-1">ABI</h4>

<p>There are still several remaining ABI compatibility issues with LLVM. On arm64 Linux there is a minor incompatibility with the C ABI, but the Rust ABI works just fine. On arm64 macOS there are several ABI incompatibilities that affect the Rust ABI too, so mixing cg_clif and cg_llvm there isn’t recommended yet. And on x86_64 Windows there is also an incompatibility around return values involving i128. I’m slowly working on fixing these.</p>

<ul>
  <li>issue <a href="https://github.com/rust-lang/rustc_codegen_cranelift/issues/1525">#1525</a>: Tracking issue for abi-cafe failures</li>
</ul>

<h1 id="contributing">Contributing</h1>

<p>Contributions are always appreciated. Feel free to take a look at <a href="https://github.com/rust-lang/rustc_codegen_cranelift/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22">good first issues</a> and ping me (@bjorn3) for help on either the relevant github issue or preferably on the <a href="https://rust-lang.zulipchat.com">rust lang</a> zulip if you get stuck.</p>]]></content><author><name></name></author><category term="cranelift" /><category term="cg_clif" /><category term="rust" /><summary type="html"><![CDATA[There has been a fair bit of progress since the last progress report! There have been 476 commits since the last progress report.]]></summary></entry><entry><title type="html">Progress report on rustc_codegen_cranelift (November 2024)</title><link href="/2024/11/14/progress-report-nov-2024.html" rel="alternate" type="text/html" title="Progress report on rustc_codegen_cranelift (November 2024)" /><published>2024-11-14T00:00:00+00:00</published><updated>2024-11-14T00:00:00+00:00</updated><id>/2024/11/14/progress-report-nov-2024</id><content type="html" xml:base="/2024/11/14/progress-report-nov-2024.html"><![CDATA[<p>There has been a fair bit of progress since the <a href="https://bjorn3.github.io/2024/04/06/progress-report-april-2024.html">last progress report</a>! There have been <a href="https://github.com/rust-lang/rustc_codegen_cranelift/compare/242b261585ffb70108bfd236a260e95ec4b06556...0b8e94eb69e0901b42e91c3b713207b33f4e46b2">383 commits</a> since the last progress report.</p>

<p>You can find a precompiled version of cg_clif at <a href="https://github.com/bjorn3/rustc_codegen_cranelift/releases/tag/dev">https://github.com/bjorn3/rustc_codegen_cranelift/releases/tag/dev</a> or in the rustc-codegen-cranelift-preview rustup component if you want to try it out.</p>

<h1 id="achievements-in-the-past-eight-months">Achievements in the past eight months</h1>

<h4 id="abi">ABI</h4>

<p>There have been significant improvements in the ABI compatibility between the Cranelift and LLVM backends. Most of these improvements affect the Rust ABI across all targets, but some only affect a single platform. In the latter case I will mention them under the section of the respective target. One of the improvements is a partial fix to the Rust ABI where rustc depended on LLVM inventing a calling convention when more values are returned than fit in the registers reserved for returning values by the native calling convention. It is hard for the Cranelift backend to match whatever calling convention was invented by LLVM. For the GCC backend, it is likely impossible to match the convention.</p>

<ul>
  <li><a href="https://github.com/rust-lang/rustc_codegen_cranelift/pull/1523">#1523</a>: Update abi-cafe</li>
  <li>issue <a href="https://github.com/bytecodealliance/wasmtime/issues/9250">wasmtime#9250</a>: Cranelift: Incorrect abi for i128, i128 return value on x86_64 sysv</li>
  <li>issue <a href="https://github.com/bytecodealliance/wasmtime/issues/9509">wasmtime#9509</a>: Cranelift: Correctly handle abi calculation for multi-part arguments</li>
  <li><a href="https://github.com/rust-lang/rust/pull/131211">rust#131211</a>: Return values larger than 2 registers using a return area pointer</li>
  <li><a href="https://github.com/rust-lang/rust/pull/132729">rust#132729</a>: Make fn_abi_sanity_check a bit stricter</li>
  <li><a href="https://github.com/bytecodealliance/wasmtime/pull/8875">wasmtime#8875</a>: Various cleanups to the ABI handling code</li>
  <li><a href="https://github.com/bytecodealliance/wasmtime/pull/8903">wasmtime#8903</a>: Various cleanups to the ABI handling code (part 1)</li>
  <li><a href="https://github.com/bytecodealliance/wasmtime/pull/9253">wasmtime#9253</a>: Couple of cleanups to the ABI computation</li>
  <li><a href="https://github.com/bytecodealliance/wasmtime/pull/9258">wasmtime#9258</a>: Remove StructArgument support from the arm64, riscv64 and s390x backends</li>
  <li><a href="https://github.com/bytecodealliance/wasmtime/pull/9267">wasmtime#9267</a>: Couple of improvements to the abi handling code (part 3)</li>
  <li><a href="https://github.com/bytecodealliance/wasmtime/pull/9284">wasmtime#9284</a>: Couple of improvements to the abi handling code (part 4)</li>
  <li><a href="https://github.com/bytecodealliance/wasmtime/pull/9287">wasmtime#9287</a>: Make the Tail call conv follow the system call conv for the return area ptr</li>
  <li>issue <a href="https://github.com/bytecodealliance/wasmtime/issues/9510">wasmtime#9510</a>: Cranelift: Remove support for implicitly adding a return area pointer</li>
  <li><a href="https://github.com/bytecodealliance/wasmtime/pull/9511">wasmtime#9511</a>: Gate support for implicit return area pointers behind an option</li>
</ul>

<h4 id="windows">Windows</h4>

<p>raw-dylib support for Windows has been implemented by @dpaoliello and @ChrisDenton. This was the last blocker before distributing cg_clif as a rustup component for Windows. Thanks a lot to both for all the work!</p>

<ul>
  <li>issue <a href="https://github.com/rust-lang/rustc_codegen_cranelift/issues/1345">#1345</a>: Implement raw-dylib for Windows</li>
  <li><a href="https://github.com/rust-lang/ar_archive_writer/pull/15">ar_archive_writer#15</a>: Add the ability to create PE import libraries (thanks @dpaoliello!)</li>
  <li><a href="https://github.com/rust-lang/ar_archive_writer/pull/17">ar_archive_writer#17</a>: Add support for creating archives with members from an import library (thanks @dpaoliello!)</li>
  <li><a href="https://github.com/rust-lang/ar_archive_writer/pull/23">ar_archive_writer#23</a>: Make the null import descriptor name unique to the import library (thanks @ChrisDenton!)</li>
  <li><a href="https://github.com/rust-lang/rust/pull/128206">rust#128206</a>: Make create_dll_import_lib easier to implement</li>
  <li><a href="https://github.com/rust-lang/rust/pull/129164">rust#129164</a>: Use ar_archive_writer for writing COFF import libs on all backends (thanks @ChrisDenton!)</li>
  <li><a href="https://github.com/rust-lang/rustc_codegen_cranelift/commit/322c2f6b1373a71e99e291f2be6f2c9b82890a02">322c2f6</a>: Sync ar_archive_writer to LLVM 18.1.3</li>
  <li><a href="https://github.com/rust-lang/rustc_codegen_cranelift/pull/1524">#1524</a>: Add support for raw-dylib (thanks @dpaoliello!)</li>
  <li><a href="https://github.com/rust-lang/rust/pull/128939">rust#128939</a>: Distribute rustc_codegen_cranelift for Windows</li>
  <li><a href="https://github.com/rust-lang/rustc_codegen_cranelift/pull/1537">#1537</a>: Don’t panic about debug info for Arm64 Windows unwind info (thanks @dpaoliello!)</li>
</ul>

<h4 id="macos">macOS</h4>

<p>Support for calling variadic functions has long been a blocker for arm64 macOS support. While Rust doesn’t support defining variadic functions, it does need to be able to call several variadic functions like <code class="language-plaintext highlighter-rouge">ioctl</code>. As Cranelift doesn’t have native variadic function support, I have been hacking in support in cg_clif by taking advantage of the fact that in most calling conventions variadic arguments are passed the exact same way as regular arguments, so I could cast the defined function signature of the callee to one which lists all variadic arguments as regular arguments. On arm64, however, Apple decided to force all variadic arguments to be passed on the stack<sup id="fnref:apple-arm64-vararg" role="doc-noteref"><a href="#fn:apple-arm64-vararg" class="footnote" rel="footnote">1</a></sup>. As a consequence, the hack cg_clif used doesn’t work there. A couple of months back, first-time contributor <a href="https://github.com/beetrees">@beetrees</a> opened a PR which adds another hack on top of the existing hack: it adds enough dummy arguments to force the actual variadic arguments to be passed on the stack as they should be. In the future I would like to add native support for variadic functions to Cranelift, but until then this hack unblocked support for arm64 macOS. It is now available as a rustup component too.</p>
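<p>As an illustration of the kind of call this is about, the sketch below calls a variadic C function from Rust. It uses <code class="language-plaintext highlighter-rouge">snprintf</code> instead of <code class="language-plaintext highlighter-rouge">ioctl</code> purely because its result is easy to inspect. The helper name <code class="language-plaintext highlighter-rouge">format_sum</code> is made up, the example assumes a Unix-like target where <code class="language-plaintext highlighter-rouge">size_t</code> matches <code class="language-plaintext highlighter-rouge">usize</code>, and it needs a toolchain that accepts <code class="language-plaintext highlighter-rouge">unsafe extern</code> blocks.</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code>use std::ffi::{c_char, c_int, CStr};

unsafe extern "C" {
    // C's snprintf; the trailing `...` marks the variadic arguments.
    // `usize` stands in for `size_t` here (an assumption, see above).
    fn snprintf(buf: *mut c_char, size: usize, fmt: *const c_char, ...) -&gt; c_int;
}

/// Formats "a + b = sum" by going through the variadic C function.
fn format_sum(a: c_int, b: c_int) -&gt; String {
    let mut buf = [0 as c_char; 64];
    let fmt = CStr::from_bytes_with_nul(b"%d + %d = %d\0").unwrap();

    // SAFETY: `buf` is writable for `buf.len()` bytes, `fmt` is
    // NUL-terminated, and the three variadic arguments are `c_int`s
    // matching the three `%d` conversions.
    let written = unsafe { snprintf(buf.as_mut_ptr(), buf.len(), fmt.as_ptr(), a, b, a + b) };
    assert!(written &gt;= 0 &amp;&amp; (written as usize) &lt; buf.len());

    // SAFETY: snprintf NUL-terminates the output when `size` is non-zero.
    unsafe { CStr::from_ptr(buf.as_ptr()) }
        .to_string_lossy()
        .into_owned()
}
</code></pre></div></div>

<p>The three trailing arguments are the variadic ones. On most targets they travel like regular arguments, while on arm64 Apple platforms they have to go on the stack, which is the case the hack has to handle.</p>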

<ul>
  <li><a href="https://github.com/rust-lang/rustc_codegen_cranelift/pull/1515">#1515</a>: enable abi-cafe tests on aarch64-apple-darwin (thanks @lqd!)</li>
  <li><a href="https://github.com/rust-lang/rustc_codegen_cranelift/pull/1500">#1500</a>: Fix varargs support on aarch64-apple-darwin (thanks @beetrees!)</li>
  <li><a href="https://github.com/rust-lang/rust/pull/127177">rust#127177</a>: Distribute rustc_codegen_cranelift for arm64 macOS</li>
  <li><a href="https://github.com/gimli-rs/object/pull/702">object#702</a>: Reverse the order of emitting relocations on MachO</li>
  <li><a href="https://github.com/rust-lang/rustc_codegen_cranelift/commit/253436c04c87b7d8dfed2fb14e42a67427196bc1">253436c</a>: Better parsing of <code class="language-plaintext highlighter-rouge">#[section_name]</code> on Mach-O</li>
  <li><a href="https://github.com/rust-lang/rustc_codegen_cranelift/commit/f340c81caac9bca69fba16a9e6f7622fa099d20a">f340c81</a>: Statically enable a couple of target features always enabled on arm64 macOS</li>
</ul>

<h4 id="performance">Performance</h4>

<p>I recently ran the rustc-perf benchmark suite on cg_clif for the first time in a very long time. The results were pretty bad, with many benchmarks showing significant regressions compared to cg_llvm. After comparing profiler output between cg_clif and cg_llvm, it became quite clear why the regressions happened: when nightly rustc switched to using lld by default on Linux, this was only done when using the LLVM backend. The reason for this was that Cranelift didn’t yet use TLSDESC on arm64 and lld only supports the TLSDESC thread local storage implementation. This was fixed later, but rustc was never changed to allow lld with cg_clif until I opened a PR a couple of days ago. The next nightly showed much better benchmark results, with most benchmarks being 10-50% faster. A couple of secondary benchmarks still showed some non-trivial regressions, but all of them are pathological code. Still, I did look further into the <a href="https://github.com/rust-lang/rustc-perf/blob/master/collector/compile-benchmarks/coercions/src/main.rs">coercions</a> benchmark. This showed significantly more time spent writing the object file than compiling to clif ir. It turns out I forgot to wrap the <code class="language-plaintext highlighter-rouge">File</code> to which the object file is written in a <code class="language-plaintext highlighter-rouge">BufWriter</code>, so it did a ton of tiny writes. Adding the <code class="language-plaintext highlighter-rouge">BufWriter</code> completely fixed the regressions on this benchmark.</p>
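<p>The general shape of the <code class="language-plaintext highlighter-rouge">BufWriter</code> fix can be sketched as follows. This is not the actual code from cg_clif; the function name <code class="language-plaintext highlighter-rouge">write_chunks</code> and its parameters are made up for illustration.</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code>use std::fs::File;
use std::io::{self, BufWriter, Write};
use std::path::Path;

/// Writes all `chunks` to `path` and returns the number of bytes written.
fn write_chunks(path: &amp;Path, chunks: &amp;[&amp;[u8]]) -&gt; io::Result&lt;usize&gt; {
    let file = File::create(path)?;
    // Writing to `file` directly would issue one write syscall per chunk.
    // BufWriter collects small writes in memory and forwards them in bulk.
    let mut writer = BufWriter::new(file);

    let mut total = 0;
    for chunk in chunks {
        writer.write_all(chunk)?;
        total += chunk.len();
    }

    // Flush explicitly so that I/O errors are reported instead of being
    // silently dropped when the BufWriter goes out of scope.
    writer.flush()?;
    Ok(total)
}
</code></pre></div></div>

<p>The bytes that end up in the file are the same either way; only the number of syscalls changes, which matters when the writer emits many tiny pieces.</p>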

<p>Moral of the story: Benchmark more often.</p>

<p>In any case some work is being done on making it easier to do local benchmarks and in the future getting <a href="https://perf.rust-lang.org">https://perf.rust-lang.org</a> to routinely benchmark cg_clif and compare it against cg_llvm.</p>

<p>One of the GSoC projects was adding a faster register allocator to Cranelift. This was successfully done and @d-sonuga (who worked on this) has shown quite promising benchmark results on their blog. Cranelift only started to support selecting this new register allocator a couple of days ago, and it hasn’t made it to a stable release of Cranelift yet. Because of this I haven’t benchmarked it myself yet.</p>

<details>
<summary>Benchmark results. (warning: very long images)</summary>

Left: before these changes. Right: after these changes.

<div style="float:left;max-width:50%;"><img loading="lazy" alt="the wall time on the benchmarks for before these changes" src="/assets/images/progress-report-nov-2024-before.png" /></div>
<div style="float:right;max-width:50%;"><img loading="lazy" alt="the wall time on the benchmarks for after these changes" src="/assets/images/progress-report-nov-2024-after.png" /></div>

<div style="clear:both;"></div>
</details>

<p></p>

<ul>
  <li><a href="https://github.com/rust-lang/rustc_codegen_cranelift/pull/1489">#1489</a>: Translate MIR to clif ir in parallel with parallel rustc</li>
  <li><a href="https://github.com/rust-lang/rustc_codegen_cranelift/pull/1541">#1541</a>: Use a BufWriter in emit_module to reduce syscall overhead</li>
  <li><a href="https://github.com/rust-lang/rustc_codegen_cranelift/pull/1542">#1542</a>: Disable clif ir verifier by default</li>
  <li><a href="https://github.com/rust-lang/rust/pull/132774">rust#132774</a>: Use lld with non-LLVM backends</li>
  <li><a href="https://github.com/bytecodealliance/wasmtime/pull/7201">wasmtime#7201</a>: aarch64: Implement TLSDESC for TLS GD accesses (thanks @afonso360!)</li>
  <li><a href="https://d-sonuga.netlify.app/gsoc/regalloc-iii/">https://d-sonuga.netlify.app/gsoc/regalloc-iii/</a></li>
  <li><a href="https://github.com/bytecodealliance/wasmtime/pull/9611">wasmtime#9611</a>: Cranelift: add option to use new single-pass register allocator.</li>
</ul>

<h4 id="inline-assembly">Inline assembly</h4>

<p>A couple of improvements to inline assembly support this time.</p>

<ul>
  <li><a href="https://github.com/rust-lang/rustc_codegen_cranelift/pull/1481">#1481</a>: Allow MaybeUninit in input and output of inline assembly (thanks @taiki-e!)</li>
  <li><a href="https://github.com/rust-lang/rustc_codegen_cranelift/commit/cba05a7a14b307d31b226a11c2104e53c2ae1291">cba05a7</a>: Support naked functions</li>
</ul>

<h4 id="simd">SIMD</h4>

<p>A whole bunch of new vendor intrinsics were implemented by contributors.</p>

<ul>
  <li><a href="https://github.com/rust-lang/rustc_codegen_cranelift/pull/1488">#1488</a>: add the llvm.x86.sse42.crc32.32.32 intrinsic (thanks @folkertdev!)</li>
  <li><a href="https://github.com/rust-lang/rustc_codegen_cranelift/pull/1490">#1490</a>: add all llvm.x86.sse42.crc32.*.* intrinsics (thanks @folkertdev!)</li>
  <li><a href="https://github.com/rust-lang/rustc_codegen_cranelift/pull/1491">#1491</a>: add llvm.x86.avx2.permd intrinsic (thanks @folkertdev!)</li>
  <li><a href="https://github.com/rust-lang/rustc_codegen_cranelift/commit/8f1d41e2a0cf73f6ecb1737f0c70a07bc8989bfa">8f1d41e</a>: Implement _rdtsc x86 vendor intrinsic</li>
  <li><a href="https://github.com/rust-lang/rustc_codegen_cranelift/pull/1495">#1495</a>: add llvm.x86.sse2.cvtps2dq (thanks @folkertdev!)</li>
  <li><a href="https://github.com/rust-lang/rustc_codegen_cranelift/commit/c48b010845213ba3be38ca4a481160ed582fac8a">c48b010</a>: Implement x86 _mm_sqrt_ss vendor intrinsic</li>
  <li><a href="https://github.com/rust-lang/rustc_codegen_cranelift/pull/1533">#1533</a>: aarch64 neon intrinsics: vmaxq_f32, vminq_f32, vaddvq_f32, vrndnq_f32 (thanks @tjamaan)</li>
</ul>

<h1 id="challenges">Challenges</h1>

<h4 id="simd-1">SIMD</h4>

<p>While <code class="language-plaintext highlighter-rouge">core::simd</code> is fully supported through emulation using scalar operations, many platform specific vendor intrinsics in <code class="language-plaintext highlighter-rouge">core::arch</code> are not supported. This has been improving though with the most important x86_64 and arm64 vendor intrinsics implemented.</p>

<p>If your program uses any unsupported vendor intrinsics you will get a compile time warning and if it actually gets reached, the program will abort with an error message indicating which intrinsic is unimplemented. Please open an issue if this happens.</p>

<ul>
  <li>issue <a href="https://github.com/bjorn3/rustc_codegen_cranelift/issues/171">#171</a>: std::arch SIMD intrinsics</li>
</ul>

<h4 id="cleanup-during-stack-unwinding-on-panics">Cleanup during stack unwinding on panics</h4>

<p>Cranelift currently doesn’t have support for cleanup during stack unwinding. I’m working on implementing this and integrating it with cg_clif.</p>

<p>Until this is fixed <code class="language-plaintext highlighter-rouge">panic::catch_unwind()</code> will not work and panicking in a single thread will abort the entire process just like <code class="language-plaintext highlighter-rouge">panic=abort</code> would. This also means you will have to use <code class="language-plaintext highlighter-rouge">-Zpanic-abort-tests</code> in combination with setting <code class="language-plaintext highlighter-rouge">panic = "abort"</code> if you want a test failure to not bring down the entire test harness.</p>
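<p>The difference is easy to observe with a small sketch like the one below. The function name <code class="language-plaintext highlighter-rouge">worker_panic_is_contained</code> is made up for illustration.</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code>use std::thread;

/// Spawns a thread that panics and reports whether the caller
/// survived to observe the failure.
fn worker_panic_is_contained() -&gt; bool {
    let handle = thread::spawn(|| {
        panic!("worker failed");
    });

    // With unwinding, `join` returns `Err` carrying the panic payload.
    // Without it, the whole process aborts before this line is reached.
    handle.join().is_err()
}
</code></pre></div></div>

<p>With a backend that supports unwinding this returns <code class="language-plaintext highlighter-rouge">true</code>. Under the limitation described above, the process is aborted by the panic in the worker thread instead.</p>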

<ul>
  <li>issue <a href="https://github.com/bytecodealliance/wasmtime/issues/1677">wasmtime#1677</a>: Support cleanup during unwinding</li>
</ul>

<h4 id="abi-1">ABI</h4>

<p>There are still several remaining ABI compatibility issues with LLVM. On arm64 Linux there is a minor incompatibility with the C ABI, but the Rust ABI works just fine. On arm64 macOS there are several ABI incompatibilities that affect the Rust ABI too, so mixing cg_clif and cg_llvm there isn’t recommended yet. And on x86_64 Windows there is also an incompatibility around return values involving i128. I’m slowly working on fixing these.</p>

<ul>
  <li>issue <a href="https://github.com/rust-lang/rustc_codegen_cranelift/issues/1525">#1525</a>: Tracking issue for abi-cafe failures</li>
</ul>

<h1 id="contributing">Contributing</h1>

<p>Contributions are always appreciated. Feel free to take a look at <a href="https://github.com/rust-lang/rustc_codegen_cranelift/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22">good first issues</a> and ping me (@bjorn3) for help on either the relevant github issue or preferably on the <a href="https://rust-lang.zulipchat.com">rust lang</a> zulip if you get stuck.</p>
<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:apple-arm64-vararg" role="doc-endnote">
      <p>https://developer.apple.com/documentation/xcode/writing-arm64-code-for-apple-platforms#Update-code-that-passes-arguments-to-variadic-functions <a href="#fnref:apple-arm64-vararg" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>]]></content><author><name></name></author><category term="cranelift" /><category term="cg_clif" /><category term="rust" /><summary type="html"><![CDATA[There has been a fair bit of progress since the last progress report! There have been 383 commits since the last progress report.]]></summary></entry><entry><title type="html">Progress report on rustc_codegen_cranelift (April 2024)</title><link href="/2024/04/06/progress-report-april-2024.html" rel="alternate" type="text/html" title="Progress report on rustc_codegen_cranelift (April 2024)" /><published>2024-04-06T00:00:00+00:00</published><updated>2024-04-06T00:00:00+00:00</updated><id>/2024/04/06/progress-report-april-2024</id><content type="html" xml:base="/2024/04/06/progress-report-april-2024.html"><![CDATA[<p>There has been a fair bit of progress since the <a href="https://bjorn3.github.io/2023/10/31/progress-report-oct-2023.html">last progress report</a>! There have been <a href="https://github.com/rust-lang/rustc_codegen_cranelift/compare/9a33f82140c6da6e5808253309c674554b93e9fe...242b261585ffb70108bfd236a260e95ec4b06556">342 commits</a> since the last progress report.</p>

<p>You can find a precompiled version of cg_clif at <a href="https://github.com/bjorn3/rustc_codegen_cranelift/releases/tag/dev">https://github.com/bjorn3/rustc_codegen_cranelift/releases/tag/dev</a> or in the rustc-codegen-cranelift-preview rustup component if you want to try it out.</p>

<h1 id="achievements-in-the-past-five-months">Achievements in the past five months</h1>

<h4 id="simd">SIMD</h4>

<p>A ton of missing SIMD intrinsics got reported over the past couple of months. Most intrinsics that people have reported missing are now implemented.</p>

<ul>
  <li><a href="https://github.com/rust-lang/rustc_codegen_cranelift/pull/1416">#1416</a>: Implement AArch64 intrinsics necessary for simd-json (thanks @afonso360!)</li>
  <li><a href="https://github.com/rust-lang/rustc_codegen_cranelift/pull/1417">#1417</a>: Implement a lot of SIMD intrinsics</li>
  <li><a href="https://github.com/rust-lang/rustc_codegen_cranelift/pull/1425">#1425</a>: Implement AES-NI and SHA256 crypto intrinsics using inline asm</li>
  <li><a href="https://github.com/rust-lang/rustc_codegen_cranelift/pull/1431">#1431</a>: Implement another batch of vendor intrinsics</li>
  <li><a href="https://github.com/rust-lang/rustc_codegen_cranelift/commit/45d8c121ba02c825379b655d8dd74e1843e98d62">45d8c12</a>: Return architecturally mandated target features to rustc</li>
  <li><a href="https://github.com/rust-lang/rustc_codegen_cranelift/pull/1443">#1443</a>: Restructure x86 signed pack instructions (thanks @Noratrieb!)</li>
  <li><a href="https://github.com/rust-lang/rustc_codegen_cranelift/commit/0dc13d7acb0118d6c14a9209d921e5278e829458">0dc13d7</a>: Implement _mm_prefetch as nop</li>
  <li><a href="https://github.com/rust-lang/rustc_codegen_cranelift/commit/24361a1b99b122806afdc01c3aae1c43fdcc7e0a">24361a1</a>: Fix portable-simd tests</li>
  <li><a href="https://github.com/rust-lang/rustc_codegen_cranelift/commit/604c8a7cf80eca33bd078d6b45faaa808ef9ecd8">604c8a7</a>: Accept [u8; N] bitmasks in simd_select_bitmask</li>
  <li><a href="https://github.com/rust-lang/rustc_codegen_cranelift/commit/cdae185e3022b6e7c6c7fe363353fe1176a06604">cdae185</a>: Implement SHA-1 x86 vendor intrinsics</li>
  <li><a href="https://github.com/rust-lang/rustc_codegen_cranelift/commit/1ace86eb0be64a57e5df7f37e17b3cf5f414943d">1ace86e</a>: Implement all x86 vendor intrinsics used by glam</li>
</ul>

<h4 id="debuginfo">Debuginfo</h4>

<p>I’ve started implementing debuginfo support beyond the already existing line table support. Most primitive types are now described in the debuginfo tables, and the locations and types of statics are now encoded. For unsupported types, <code class="language-plaintext highlighter-rouge">[u8; size_of::&lt;T&gt;()]</code> is used as the type instead. While debuginfo for statics may not be all that useful for most use cases, describing types is a prerequisite for debuginfo describing the locations of locals, which is very useful for debugging.</p>

<ul>
  <li><a href="https://github.com/rust-lang/rustc_codegen_cranelift/pull/1470">#1470</a>: Various small debuginfo improvements</li>
  <li><a href="https://github.com/rust-lang/rustc_codegen_cranelift/pull/1472">#1472</a>: Add debuginfo for statics</li>
  <li>issue <a href="https://github.com/rust-lang/rustc_codegen_cranelift/issues/166">#166</a>: DWARF support</li>
</ul>

<h4 id="s390x-support">s390x support</h4>

<p>A couple of fixes to the s390x support now allow compiling and testing cg_clif on s390x.</p>

<ul>
  <li><a href="https://github.com/rust-lang/rustc_codegen_cranelift/pull/1457">#1457</a>: Fix simd_select_bitmask on big-endian systems (thanks @uweigand!)</li>
  <li><a href="https://github.com/rust-lang/rustc_codegen_cranelift/pull/1458">#1458</a>: Fix download hash check on big-endian systems (thanks @uweigand!)</li>
  <li><a href="https://github.com/rust-lang/rustc_codegen_cranelift/commit/b03b41420b2dc900a9db019f4b5a5c22c05d2bb8">b03b414</a>: Fix stack alignment problem on s390x</li>
</ul>

<h1 id="challenges">Challenges</h1>

<h4 id="simd-1">SIMD</h4>

<p>While <code class="language-plaintext highlighter-rouge">core::simd</code> is fully supported through emulation using scalar operations, many platform specific vendor intrinsics in <code class="language-plaintext highlighter-rouge">core::arch</code> are not supported. This has been improving though with the most important x86_64 and arm64 vendor intrinsics implemented.</p>

<p>If your program uses any unsupported vendor intrinsics you will get a compile time warning and if it actually gets reached, the program will abort with an error message indicating which intrinsic is unimplemented. Please open an issue if this happens.</p>

<ul>
  <li>issue <a href="https://github.com/bjorn3/rustc_codegen_cranelift/issues/171">#171</a>: std::arch SIMD intrinsics</li>
</ul>

<h4 id="cleanup-during-stack-unwinding-on-panics">Cleanup during stack unwinding on panics</h4>

<p>Cranelift currently doesn’t have support for cleanup during stack unwinding. I’m working on implementing this and integrating it with cg_clif.</p>

<p>Until this is fixed <code class="language-plaintext highlighter-rouge">panic::catch_unwind()</code> will not work and panicking in a single thread will abort the entire process just like <code class="language-plaintext highlighter-rouge">panic=abort</code> would. This also means you will have to use <code class="language-plaintext highlighter-rouge">-Zpanic-abort-tests</code> in combination with setting <code class="language-plaintext highlighter-rouge">panic = "abort"</code> if you want a test failure to not bring down the entire test harness.</p>

<ul>
  <li>issue <a href="https://github.com/bytecodealliance/wasmtime/issues/1677">wasmtime#1677</a>: Support cleanup during unwinding</li>
</ul>
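
<p>As a minimal sketch of what this limitation means in practice, the following program relies on <code class="language-plaintext highlighter-rouge">panic::catch_unwind()</code>. The helper name <code class="language-plaintext highlighter-rouge">run_isolated</code> is made up for illustration. With a backend that supports unwinding, the panic is caught and the function returns <code class="language-plaintext highlighter-rouge">true</code>; as described above, a cg_clif build without unwinding support would instead abort the whole process at the panic.</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code>use std::panic;

// Hypothetical helper: runs a job that panics and reports whether
// the panic could be caught.
fn run_isolated() -> bool {
    // catch_unwind can only return if the backend supports unwinding.
    let result = panic::catch_unwind(|| {
        panic!("job failed");
    });
    result.is_err()
}

fn main() {
    // Silence the default panic message to keep the output readable.
    panic::set_hook(Box::new(|_| {}));
    if run_isolated() {
        println!("panic was caught, the process keeps running");
    }
}
</code></pre></div></div>

<p>Code like this is the reason a test harness needs <code class="language-plaintext highlighter-rouge">-Zpanic-abort-tests</code> when unwinding is unavailable: with that flag a failing test takes down only its own process rather than the whole harness.</p>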

<h1 id="contributing">Contributing</h1>

<p>Contributions are always appreciated. Feel free to take a look at <a href="https://github.com/bjorn3/rustc_codegen_cranelift/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22">good first issues</a> and ping me (@bjorn3) for help on either the relevant github issue or preferably on the <a href="https://rust-lang.zulipchat.com">rust lang</a> zulip if you get stuck.</p>]]></content><author><name></name></author><category term="cranelift" /><category term="cg_clif" /><category term="rust" /><summary type="html"><![CDATA[There has been a fair bit of progress since the last progress report! There have been 342 commits since the last progress report.]]></summary></entry><entry><title type="html">Progress report on rustc_codegen_cranelift (Oct 2023)</title><link href="/2023/10/31/progress-report-oct-2023.html" rel="alternate" type="text/html" title="Progress report on rustc_codegen_cranelift (Oct 2023)" /><published>2023-10-31T00:00:00+00:00</published><updated>2023-10-31T00:00:00+00:00</updated><id>/2023/10/31/progress-report-oct-2023</id><content type="html" xml:base="/2023/10/31/progress-report-oct-2023.html"><![CDATA[<p>Quite some exciting progress since the <a href="https://bjorn3.github.io/2023/07/29/progress-report-july-2023.html">last progress report</a>! There have been <a href="https://github.com/rust-lang/rustc_codegen_cranelift/compare/6641b3a548a425eae518b675e43b986094daf609...9a33f82140c6da6e5808253309c674554b93e9fe">180 commits</a> since the last progress report.</p>

<p>As of today, rustc_codegen_cranelift is available on nightly! :tada: You can run <code class="language-plaintext highlighter-rouge">rustup component add rustc-codegen-cranelift-preview --toolchain nightly</code> to install it and then either <code class="language-plaintext highlighter-rouge">CARGO_PROFILE_DEV_CODEGEN_BACKEND=cranelift cargo +nightly build -Zcodegen-backend</code> to use it for the current invocation or add</p>

<div class="language-toml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nn">[unstable]</span>
<span class="py">codegen-backend</span> <span class="p">=</span> <span class="kc">true</span>

<span class="nn">[profile.dev]</span>
<span class="py">codegen-backend</span> <span class="p">=</span> <span class="s">"cranelift"</span>
</code></pre></div></div>

<p>to <code class="language-plaintext highlighter-rouge">.cargo/config.toml</code> or</p>

<div class="language-toml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># This line needs to come before anything else in Cargo.toml</span>
<span class="py">cargo-features</span> <span class="p">=</span> <span class="nn">["codegen-backend"]</span>

<span class="nn">[profile.dev]</span>
<span class="py">codegen-backend</span> <span class="p">=</span> <span class="s">"cranelift"</span>
</code></pre></div></div>

<p>to <code class="language-plaintext highlighter-rouge">Cargo.toml</code> to enable it by default for debug builds. You can also set <code class="language-plaintext highlighter-rouge">codegen-backend</code> for individual packages using <code class="language-plaintext highlighter-rouge">[profile.dev.package.my_program] codegen-backend = "cranelift"</code>. This would, for example, allow building a game engine using LLVM with all optimizations enabled, while building your game logic using Cranelift for faster iteration.</p>

<p>The following targets are currently supported:</p>

<ul>
  <li>x86_64-unknown-linux-gnu</li>
  <li>x86_64-unknown-linux-musl</li>
  <li>x86_64-apple-darwin</li>
  <li>aarch64-unknown-linux-gnu</li>
  <li>aarch64-unknown-linux-musl</li>
</ul>

<p>Windows support has been omitted for now. On macOS only x86_64 is currently supported, as Apple invented their own calling convention for arm64, for which variadic functions can’t easily be implemented as a hack. If you are using an M1 processor, you could try installing the x86_64 version of rustc and then using Rosetta 2. Rosetta 2 will hurt performance though, so you will need to check whether it is faster than the LLVM backend with arm64 rustc.</p>

<p>Also be aware that there are currently still some <a href="#challenges">missing features</a>.</p>

<h1 id="achievements-in-the-past-three-months">Achievements in the past three months</h1>

<h4 id="distributing-as-rustup-component">Distributing as rustup component</h4>

<p>As I already indicated at the start of this progress report, cg_clif is now available as a rustup component.</p>

<ul>
  <li><a href="https://github.com/rust-lang/rust/pull/81746">rust-lang/rust#81746</a>: Distribute cg_clif as rustup component on the nightly channel</li>
</ul>

<h4 id="moved-to-the-rust-lang-org">Moved to the rust-lang org</h4>

<p>Rustc_codegen_cranelift is now part of the rust-lang github organization: <a href="https://github.com/rust-lang/rustc_codegen_cranelift/">https://github.com/rust-lang/rustc_codegen_cranelift/</a></p>

<h4 id="risc-v-support">Risc-V support</h4>

<p>While Cranelift has had a riscv64 backend for a couple of months now, only recently have some missing features been implemented and some bugs been fixed by @afonso360 to make cg_clif work on Linux riscv64gc. Once that was done I only needed to add inline assembly support for riscv64.</p>

<ul>
  <li><a href="https://github.com/rust-lang/rustc_codegen_cranelift/pull/1398">#1398</a>: Add riscv64 linux support</li>
</ul>

<h4 id="simd">SIMD</h4>

<p>A whole bunch more x86_64 and arm64 vendor intrinsics have been implemented. This includes arm64 vendor intrinsics used by newer regex versions and the x86_64 vendor intrinsics used by rav1e and image. In addition a bunch of the new platform independent simd intrinsics used by <code class="language-plaintext highlighter-rouge">std::simd</code> have been implemented. The hack to disable detection of target features using <code class="language-plaintext highlighter-rouge">is_x86_feature_detected!()</code> has now been removed when inline asm support is enabled. This hack never worked when using a standard library compiled by LLVM anyway and enough vendor intrinsics are now supported to not need it anymore most of the time.</p>

<ul>
  <li><a href="https://github.com/rust-lang/rustc_codegen_cranelift/commit/c974bc89b874fa5a46dfb2db8e983d4b864e42c5">c974bc8</a>: Update regex and implement necessary AArch64 vendor intrinsics</li>
  <li><a href="https://github.com/rust-lang/rustc_codegen_cranelift/commit/f1ede97b145c084b14579c467c4276d247193adf">f1ede97</a>: Update portable-simd test and implement new simd_* platform intrinsics</li>
  <li><a href="https://github.com/rust-lang/rustc_codegen_cranelift/commit/e5ba1e84171899aa99b4ba6c1b5d4eef3873592a">e5ba1e8</a>: Implement llvm intrinsics necessary for rav1e</li>
  <li><a href="https://github.com/rust-lang/rustc_codegen_cranelift/commit/a558968dbe962b1daa730426d001becebd102931">a558968</a>: Implement all llvm intrinsics necessary for the image crate</li>
</ul>
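
<p>To illustrate why the backend has to implement vendor intrinsics once feature detection reports the real CPU features, here is a small sketch of the usual runtime-dispatch pattern. The function names <code class="language-plaintext highlighter-rouge">add4</code>, <code class="language-plaintext highlighter-rouge">add4_sse2</code> and <code class="language-plaintext highlighter-rouge">add4_scalar</code> are hypothetical, and SSE2 is picked only because it keeps the example short; it makes no claim about which intrinsics cg_clif supports.</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// Adds two 4-lane vectors, picking an implementation at run time.
fn add4(a: [i32; 4], b: [i32; 4]) -> [i32; 4] {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("sse2") {
            // Calling the SSE2 version is fine: support was just checked.
            return unsafe { add4_sse2(a, b) };
        }
    }
    add4_scalar(a, b)
}

// Vendor intrinsic path; the codegen backend has to know these intrinsics.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "sse2")]
unsafe fn add4_sse2(a: [i32; 4], b: [i32; 4]) -> [i32; 4] {
    use std::arch::x86_64::{__m128i, _mm_add_epi32, _mm_loadu_si128, _mm_storeu_si128};
    let mut out = [0i32; 4];
    unsafe {
        // Unaligned loads and stores, so plain arrays are acceptable here.
        let va = _mm_loadu_si128(a.as_ptr() as *const __m128i);
        let vb = _mm_loadu_si128(b.as_ptr() as *const __m128i);
        _mm_storeu_si128(out.as_mut_ptr() as *mut __m128i, _mm_add_epi32(va, vb));
    }
    out
}

// Portable fallback with the same wrapping semantics.
fn add4_scalar(a: [i32; 4], b: [i32; 4]) -> [i32; 4] {
    let mut out = [0i32; 4];
    for lane in 0..4 {
        out[lane] = a[lane].wrapping_add(b[lane]);
    }
    out
}

fn main() {
    println!("{:?}", add4([1, 2, 3, 4], [10, 20, 30, 40]));
}
</code></pre></div></div>

<p>When the detection macro reports a feature as available, the intrinsic path is taken, so every intrinsic used on that path needs an implementation in the backend.</p>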

<h4 id="inline-assembly">Inline assembly</h4>

<p>Inline assembly is now supported on arm64 and riscv64 as well as macOS and Windows. Furthermore, support for inline assembly is now being tested in cg_clif’s CI. With the exception of <code class="language-plaintext highlighter-rouge">sym</code> operands, inline assembly is now declared as stable for usage with cg_clif.</p>

<ul>
  <li><a href="https://github.com/rust-lang/rustc_codegen_cranelift/pull/1396">#1396</a>: Support inline asm on AArch64</li>
  <li><a href="https://github.com/rust-lang/rustc_codegen_cranelift/pull/1397">#1397</a>: Test inline asm support on CI</li>
  <li>issue <a href="https://github.com/rust-lang/rustc_codegen_cranelift/issues/1204">#1204</a>: Full asm!() support</li>
  <li><a href="https://github.com/rust-lang/rustc_codegen_cranelift/pull/1403">#1403</a>: Support and stabilize inline asm on all platforms</li>
</ul>

<h1 id="challenges">Challenges</h1>

<h4 id="simd-1">SIMD</h4>

<p>While <code class="language-plaintext highlighter-rouge">core::simd</code> is fully supported through emulation using scalar operations, many platform specific vendor intrinsics in <code class="language-plaintext highlighter-rouge">core::arch</code> are not supported. This has been improving though with the most important x86_64 and arm64 vendor intrinsics implemented.</p>

<p>If your program uses any unsupported vendor intrinsics you will get a compile time warning and if it actually gets reached, the program will abort with an error message indicating which intrinsic is unimplemented. Please open an issue if this happens.</p>

<ul>
  <li>issue <a href="https://github.com/bjorn3/rustc_codegen_cranelift/issues/171">#171</a>: std::arch SIMD intrinsics</li>
</ul>

<h4 id="cleanup-during-stack-unwinding-on-panics">Cleanup during stack unwinding on panics</h4>

<p>Cranelift currently doesn’t have support for cleanup during stack unwinding. I’m working on implementing this and integrating it with cg_clif.</p>

<p>Until this is fixed <code class="language-plaintext highlighter-rouge">panic::catch_unwind()</code> will not work and panicking in a single thread will abort the entire process just like <code class="language-plaintext highlighter-rouge">panic=abort</code> would. This also means you will have to use <code class="language-plaintext highlighter-rouge">-Zpanic-abort-tests</code> in combination with setting <code class="language-plaintext highlighter-rouge">panic = "abort"</code> if you want a test failure to not bring down the entire test harness.</p>

<ul>
  <li>issue <a href="https://github.com/bytecodealliance/wasmtime/issues/1677">wasmtime#1677</a>: Support cleanup during unwinding</li>
</ul>

<h1 id="contributing">Contributing</h1>

<p>Contributions are always appreciated. Feel free to take a look at <a href="https://github.com/bjorn3/rustc_codegen_cranelift/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22">good first issues</a> and ping me (@bjorn3) for help on either the relevant github issue or preferably on the <a href="https://rust-lang.zulipchat.com">rust lang</a> zulip if you get stuck.</p>]]></content><author><name></name></author><category term="cranelift" /><category term="cg_clif" /><category term="rust" /><summary type="html"><![CDATA[Quite some exciting progress since the last progress report! There have been 180 commits since the last progress report.]]></summary></entry><entry><title type="html">Progress report on rustc_codegen_cranelift (July 2023)</title><link href="/2023/07/29/progress-report-july-2023.html" rel="alternate" type="text/html" title="Progress report on rustc_codegen_cranelift (July 2023)" /><published>2023-07-29T00:00:00+00:00</published><updated>2023-07-29T00:00:00+00:00</updated><id>/2023/07/29/progress-report-july-2023</id><content type="html" xml:base="/2023/07/29/progress-report-july-2023.html"><![CDATA[<p>It has been quite a while since the <a href="https://bjorn3.github.io/2022/10/12/progress-report-okt-2022.html">last progress report</a>. A ton of progress has been made since then, but I simply didn’t get around to writing a new progress report. There have been <a href="https://github.com/bjorn3/rustc_codegen_cranelift/compare/69297f9c863f0e153d10447685b9a2cc34f60d57...6641b3a548a425eae518b675e43b986094daf609">639 commits</a> since the last progress report. This is significantly more than last time, given how long it has been since the last progress report. As such I skimmed the commit list to see what stood out to me. I may have missed some important things.</p>

<p>You can find a precompiled version of cg_clif at <a href="https://github.com/bjorn3/rustc_codegen_cranelift/releases/tag/dev">https://github.com/bjorn3/rustc_codegen_cranelift/releases/tag/dev</a> if you want to try it out.</p>

<h1 id="achievements-in-the-past-nine-months">Achievements in the past nine months</h1>

<h4 id="perf-improvements">Perf improvements</h4>

<p>Debug assertions were accidentally enabled for the precompiled dev releases. Disabling them raised the improvement of cg_clif over cg_llvm from ~13% to ~39% on one benchmark. Local builds were not affected by this issue.</p>

<ul>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/pull/1347">#1347</a>: Build CI dist artifacts without debug assertions</li>
</ul>

<h4 id="simd">SIMD</h4>

<p>A lot of vendor intrinsics have been implemented. The regex crate now works on AVX2 systems without cg_clif’s hack to make <code class="language-plaintext highlighter-rouge">is_x86_feature_detected!()</code> hide all features other than SSE and SSE2. This hack doesn’t work when the standard library is compiled using cg_llvm as will be the case when cg_clif gets distributed with rustup.</p>

<ul>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/pull/1297">#1297</a>: Implement some AArch64 SIMD intrinsics</li>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/pull/1309">#1309</a>: Implement simd_gather and simd_scatter</li>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/pull/1378">#1378</a>: Implement all vendor intrinsics used by regex on AVX2 systems</li>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/commit/e4d0811360e79b2789f27a65eed7d3248e1e092c">e4d0811</a>: Implement _mm_srli_epi16 and _mm_slli_epi16</li>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/commit/c09ef968782c8ada9aa5427605b1b7925ac60d32">c09ef96</a>: Implement _mm_shuffle_epi8</li>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/pull/1380">#1380</a>: Implement a whole bunch more x86 vendor intrinsics</li>
</ul>

<h4 id="build-system-rework">Build system rework</h4>

<p>The build system has seen a significant rework to allow using it to test a precompiled cg_clif version and to allow vendoring of everything for offline builds. This was a requirement for testing cg_clif in rust’s CI. A PR is open to run part of cg_clif’s tests in rust’s CI.</p>

<ul>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/pull/1291">#1291</a>: Move downloaded test project to downloads/</li>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/pull/1298">#1298</a>: Introduce CargoProject type and use it where possible</li>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/pull/1300">#1300</a>: Rename the build/ directory to dist/</li>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/pull/1302">#1302</a>: Allow specifying where build artifacts should be written to</li>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/pull/1338">#1338</a>: Avoid clobbering build_system/ and ~/.cargo/bin</li>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/pull/1339">#1339</a>: Many build system improvements</li>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/pull/1340">#1340</a>: Push up a lot of rustc and cargo references</li>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/pull/1341">#1341</a>: Refactor sysroot building</li>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/pull/1374">#1374</a>: Allow building and testing without rustup</li>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/commit/5b3bc29008643203b4de3ffb4c5b5141039c88e6">5b3bc29</a>: Allow testing a cranelift backend built into rustc itself</li>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/commit/134dc334857e453c50f8ea31b13cbda106204f20">134dc33</a>: Fix testing with unstable features disabled</li>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/pull/1357">#1357</a>: Support testing of cg_clif in rust’s CI</li>
  <li><a href="https://github.com/rust-lang/rust/pull/112701">rust#112701</a>: Run part of cg_clif’s tests in CI (not yet merged)</li>
</ul>

<h4 id="inline-assembly">Inline assembly</h4>

<p><code class="language-plaintext highlighter-rouge">const</code> operands for <code class="language-plaintext highlighter-rouge">asm!()</code> and <code class="language-plaintext highlighter-rouge">global_asm!()</code> are now supported. <code class="language-plaintext highlighter-rouge">sym</code> operands work in some cases, but if rustc decides to make the respective function private to the codegen unit it is contained in, you will get a linker error, as inline asm ends up in a separate codegen unit while rustc thinks it ends up in the same codegen unit.</p>

<ul>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/pull/1350">#1350</a>: Implement const and sym operands for inline asm</li>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/pull/1351">#1351</a>: Implement const and sym operands for global asm</li>
</ul>
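
<p>For readers unfamiliar with <code class="language-plaintext highlighter-rouge">const</code> operands, here is a minimal sketch of one in inline assembly. The function name <code class="language-plaintext highlighter-rouge">load_immediate</code> is hypothetical, the assembly is x86_64 only (other targets get a plain Rust fallback), and it assumes a toolchain recent enough to accept <code class="language-plaintext highlighter-rouge">const</code> operands.</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#[cfg(target_arch = "x86_64")]
use std::arch::asm;

// Hypothetical example: loads an immediate into a register through a
// const operand.
#[cfg(target_arch = "x86_64")]
fn load_immediate() -> u64 {
    let value: u64;
    unsafe {
        // The const operand is pasted into the template as a literal,
        // so the assembler sees something like "mov rax, 42".
        asm!(
            "mov {out}, {imm}",
            out = out(reg) value,
            imm = const 42,
            options(pure, nomem, nostack),
        );
    }
    value
}

// Fallback so the sketch also builds on other architectures.
#[cfg(not(target_arch = "x86_64"))]
fn load_immediate() -> u64 {
    42
}

fn main() {
    println!("{}", load_immediate());
}
</code></pre></div></div>

<p>Unlike an <code class="language-plaintext highlighter-rouge">in(reg)</code> operand, a <code class="language-plaintext highlighter-rouge">const</code> operand never occupies a register; its value has to be known at compile time.</p>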

<h4 id="s390x-support-tested-in-ci">s390x support tested in CI</h4>

<p>@afonso360 contributed CI support for testing s390x.</p>

<ul>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/pull/1304">#1304</a>: Add S390X CI Support</li>
</ul>

<h4 id="archive-writer">Archive writer</h4>

<p>As I already pointed out in a <a href="https://bjorn3.github.io/2022/06/13/progress-report-june-2022.html#migrating-away-from-rust-ar">previous</a> progress report I had been working on switching out the archive writer from a fork of rust-ar to a rewrite of LLVM’s archive writer. This work has since been completed. The LLVM backend still uses LLVM’s original version because a couple of regressions were found in the integration with rustc. I plan to fix those issues and switch the LLVM backend to the rust rewrite some time in the future.</p>

<ul>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/issues/1155">#1155</a>: Remove the ar git dependency</li>
  <li><a href="https://github.com/rust-lang/rust/pull/97485">rust#97485</a>: Rewrite LLVM’s archive writer in Rust</li>
</ul>

<h4 id="benchmark-improvements">Benchmark improvements</h4>

<p>Release builds of simple-raytracer are now benchmarked too. Release builds take longer to compile than debug builds, but should still compile faster than with the LLVM backend. At the same time the resulting executables are about 20% faster and, for simple-raytracer, faster than those produced by LLVM in debug mode.</p>

<p>CI runs now also show the benchmark results if you scroll down on the overview page of the workflow run. See for example <a href="https://github.com/bjorn3/rustc_codegen_cranelift/actions/runs/5645453142">https://github.com/bjorn3/rustc_codegen_cranelift/actions/runs/5645453142</a>.</p>

<ul>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/pull/1373">#1373</a>: Benchmark clif release builds with ./y.rs bench</li>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/commit/448b7a3a12e6e76547c95cd327d83b2c7dff3c65">448b7a3</a>: Record GHA step summaries for benchmarking</li>
</ul>

<h1 id="challenges">Challenges</h1>

<h4 id="simd-1">SIMD</h4>

<p>While <code class="language-plaintext highlighter-rouge">core::simd</code> is fully supported through emulation using scalar operations, many platform specific vendor intrinsics in <code class="language-plaintext highlighter-rouge">core::arch</code> are not supported. This has been improving though with the most important (as far as the regex crate and its dependencies are concerned) x86 vendor intrinsics implemented.</p>

<ul>
  <li>issue <a href="https://github.com/bjorn3/rustc_codegen_cranelift/issues/171">#171</a>: std::arch SIMD intrinsics</li>
</ul>

<h4 id="cleanup-during-stack-unwinding-on-panics">Cleanup during stack unwinding on panics</h4>

<p>Cranelift currently doesn’t have support for cleanup during stack unwinding. I’m working on implementing this and integrating it with cg_clif.</p>

<ul>
  <li>issue <a href="https://github.com/bytecodealliance/wasmtime/issues/1677">wasmtime#1677</a>: Support cleanup during unwinding</li>
</ul>

<h4 id="distributing-as-rustup-component">Distributing as rustup component</h4>

<p>There is progress towards distributing cg_clif as a rustup component. For example a decent number of SIMD vendor intrinsics are now implemented and there is an open PR to run part of cg_clif’s test suite on rust’s CI. There are still things to be done though. <a href="https://github.com/bjorn3/rustc_codegen_cranelift/milestone/2">https://github.com/bjorn3/rustc_codegen_cranelift/milestone/2</a> lists things I know of that still need to be done.</p>

<h1 id="contributing">Contributing</h1>

<p>Contributions are always appreciated. Feel free to take a look at <a href="https://github.com/bjorn3/rustc_codegen_cranelift/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22">good first issues</a> and ping me (@bjorn3) for help on either the relevant github issue or preferably on the <a href="https://rust-lang.zulipchat.com">rust lang</a> zulip if you get stuck.</p>]]></content><author><name></name></author><category term="cranelift" /><category term="cg_clif" /><category term="rust" /><summary type="html"><![CDATA[It has been quite a while since the last progress report. A ton of progress has been made since then, but I simply didn’t get around to writing a new progress report. There have been 639 commits since the last progress report. This is significantly more than last time, given how long it has been since the last progress report. As such I skimmed the commit list to see what stood out to me. I may have missed some important things.]]></summary></entry><entry><title type="html">Progress report on rustc_codegen_cranelift (Oct 2022)</title><link href="/2022/10/12/progress-report-okt-2022.html" rel="alternate" type="text/html" title="Progress report on rustc_codegen_cranelift (Oct 2022)" /><published>2022-10-12T00:00:00+00:00</published><updated>2022-10-12T00:00:00+00:00</updated><id>/2022/10/12/progress-report-okt-2022</id><content type="html" xml:base="/2022/10/12/progress-report-okt-2022.html"><![CDATA[<p>There has been a ton of progress since the <a href="https://bjorn3.github.io/2022/06/13/progress-report-june-2022.html">last progress report</a>. There have been <a href="https://github.com/bjorn3/rustc_codegen_cranelift/compare/ec841f58d38e5763bc0ad9f405ed5fa075e3fd30...69297f9c863f0e153d10447685b9a2cc34f60d57">303 commits</a> since then. @afonso360 has been contributing a ton to improve Windows and AArch64 support. (Thanks a lot for that!)</p>

<h1 id="achievements-in-the-past-four-months">Achievements in the past four months</h1>

<h4 id="windows-support-with-the-msvc-toolchain">Windows support with the MSVC toolchain</h4>

<p>Windows support with the MSVC toolchain has been added by @afonso360. This required a Cranelift change to add COFF based TLS support, a rewrite of the bash scripts for testing in Rust (as Windows doesn’t have bash), adding inline stack probing to Cranelift (stack probing is necessary on Windows to grow the stack) and finally a couple of minor changes to tests to make them run on Windows. There are still a couple of issues though. For example the JIT mode just crashes. In addition Bevy gets miscompiled, causing it to crash at runtime. An investigation into this is ongoing.</p>

<ul>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/pull/1252">#1252</a>: Move test script to y.rs</li>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/pull/1253">#1253</a>: Fix <code class="language-plaintext highlighter-rouge">no_sysroot</code> testsuite for MSVC environments</li>
  <li><a href="https://github.com/bytecodealliance/wasmtime/pull/4546">bytecodealliance/wasmtime#4546</a>: cranelift: Add COFF TLS Support</li>
  <li><a href="https://github.com/bytecodealliance/wasmtime/pull/4747">bytecodealliance/wasmtime#4747</a>: cranelift: Add inline stack probing for x64</li>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/issues/1249">#1249</a>: Miscompilation of Bevy with MSVC</li>
</ul>

<h4 id="abi-fixes">Abi fixes</h4>

<p>Gankra’s <a href="https://github.com/gankra/abi-cafe">abi cafe</a> (previously abi-checker) now gets run on CI. This uncovered a couple of ABI issues between cg_clif and cg_llvm. Some were the fault of cg_clif and others had to be fixed in Cranelift.</p>

<ul>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/pull/1255">#1255</a>: Add abi-checker to y.rs and run it on CI</li>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/commit/45b6cd6a8a2a3b364d22d4fabc0d72f9e37e3e50">45b6cd6</a>: Fix a crash for 11 single byte fields passed through the C abi</li>
  <li><a href="https://github.com/bytecodealliance/wasmtime/pull/4634">bytecodealliance/wasmtime#4634</a>: Fix sret for AArch64</li>
</ul>

<h4 id="aarch64-support">AArch64 support</h4>

<p>Linux on AArch64 now passes the full test suite of cg_clif. It is not tested in CI, so it is possible that support will regress in the future.</p>

<h4 id="basic-s390x-support">Basic s390x support</h4>

<p>Basic support for IBM’s s390x architecture has been added by @uweigand. There is no testing on CI and there are still some test failures.</p>

<ul>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/pull/1260">#1260</a>: Ignore ptr_bitops_tagging test on s390x</li>
  <li>issue <a href="https://github.com/bjorn3/rustc_codegen_cranelift/issues/1258">#1258</a>: s390x test failure due to unsupported stack realignment</li>
  <li>issue <a href="https://github.com/bjorn3/rustc_codegen_cranelift/issues/1259">#1259</a>: Enabling s390x on CI</li>
</ul>

<h4 id="multi-threading-support">Multi-threading support</h4>

<p>The LLVM backend has supported multi-threading during compilation from LLVM IR to object files since <a href="https://github.com/rust-lang/rust/pull/16367">2014</a>. While the frontend is not parallelized, this can still give a non-trivial perf boost. Cg_clif until recently didn’t support this, causing it to take longer to compile especially on machines with many cores. After doing significant refactorings all over cg_clif for about two weeks I was able to implement multi-threading support in cg_clif too. It was a lot of effort, but it was well worth it. There are almost no cases where cg_llvm is faster than cg_clif now.</p>

<details><summary>The perf results (warning: long image)</summary>

<img src="https://user-images.githubusercontent.com/17426603/186444984-05a1362a-60c8-486f-bdcd-01bcdab87e52.png" alt="wall time on the rustc perf suite when compared to cg_llvm which shows almost all benchmarks having a significant improvement" />

</details>

<ul>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/pull/1264">#1264</a>: Refactorings for enabling parallel compilation (part 1)</li>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/pull/1266">#1266</a>: Refactorings for enabling parallel compilation (part 2)</li>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/pull/1271">#1271</a>: Support compiling codegen units in parallel</li>
</ul>
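
<p>The general idea of handing codegen units to worker threads can be shown with a toy model. This is only a sketch and not how cg_clif is structured internally: <code class="language-plaintext highlighter-rouge">compile_unit</code> and <code class="language-plaintext highlighter-rouge">compile_in_parallel</code> are hypothetical names, and the “compilation” is just arithmetic standing in for real work.</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code>use std::thread;

// Stand-in for compiling one codegen unit; returns the size of the
// resulting "object file" in bytes.
fn compile_unit(function_count: usize) -> usize {
    function_count * 16
}

// Hands every codegen unit to its own worker thread and sums the results.
fn compile_in_parallel(units: [usize; 4]) -> usize {
    let mut workers = Vec::new();
    for function_count in units {
        // The units are independent of each other, so each worker owns
        // its input and no locking is needed.
        workers.push(thread::spawn(move || compile_unit(function_count)));
    }
    let mut total = 0;
    for worker in workers {
        total += worker.join().expect("codegen worker panicked");
    }
    total
}

fn main() {
    println!("{} bytes of object code", compile_in_parallel([3, 1, 4, 1]));
}
</code></pre></div></div>

<p>Because the workers share no state, the achievable speedup depends mostly on how many codegen units there are and how evenly the work is spread across them.</p>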

<h4 id="inline-assembly">Inline assembly</h4>

<p>While working on implementing multi-threading I was able to remove the partial linking hack that was used for supporting inline assembly and incremental compilation at the same time. This hack was incompatible with macOS. Now that it is no longer necessary, inline assembly works on macOS too.</p>

<ul>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/commit/e45f6000a0bd46d4b7580db59c86f3d30adbc270">e45f600</a>: Remove the partial linking hack for global asm support</li>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/commit/f76ca2247998bff4e10b73fcb464a0a83edbfeb0">f76ca22</a>: Enable inline asm on macOS</li>
</ul>

<h4 id="portable-simd">Portable simd</h4>

<p>I implemented a couple of intrinsics used by <code class="language-plaintext highlighter-rouge">core::simd</code>. Only <code class="language-plaintext highlighter-rouge">simd_scatter</code>, <code class="language-plaintext highlighter-rouge">simd_gather</code> and <code class="language-plaintext highlighter-rouge">simd_arith_offset</code> are missing now. Note that a large portion of <code class="language-plaintext highlighter-rouge">core::arch</code> is still unimplemented.</p>

<ul>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/pull/1277">#1277</a>: Implement a couple of portable simd intrinsics</li>
</ul>

<h1 id="challenges">Challenges</h1>

<h4 id="simd">SIMD</h4>

<p>Many vendor intrinsics remain unimplemented. The new portable SIMD project will however likely exclusively use so called “platform intrinsics” of which there are much fewer, compared to the LLVM intrinsics used to implement all vendor intrinsics in <code class="language-plaintext highlighter-rouge">core::arch</code>. In addition “platform intrinsics” are the common denominator between platforms supported by rustc, so they only have to be implemented once in cg_clif itself and in fact most have already been implemented. Cranelift does need a definition for each platform when native SIMD is used, but emulating “platform intrinsics” using scalar instructions is pretty easy.</p>

<ul>
  <li>issue <a href="https://github.com/bjorn3/rustc_codegen_cranelift/issues/171">#171</a>: std::arch SIMD intrinsics</li>
</ul>

<h4 id="cleanup-during-stack-unwinding-on-panics">Cleanup during stack unwinding on panics</h4>

<p>Cranelift currently doesn’t have support for cleanup during stack unwinding.</p>

<ul>
  <li>issue <a href="https://github.com/bytecodealliance/wasmtime/issues/1677">wasmtime#1677</a>: Support cleanup during unwinding</li>
</ul>

<h4 id="distributing-as-rustup-component">Distributing as rustup component</h4>

<p>There is progress towards distributing cg_clif as a rustup component, but there are still things to be done. https://github.com/bjorn3/rustc_codegen_cranelift/milestone/2 lists the things I know of that still need to be done.</p>

<h1 id="contributing">Contributing</h1>

<p>Contributions are always appreciated. Feel free to take a look at <a href="https://github.com/bjorn3/rustc_codegen_cranelift/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22">good first issues</a> and ping me (@bjorn3) for help on either the relevant github issue or preferably on the <a href="https://rust-lang.zulipchat.com">rust lang</a> zulip if you get stuck.</p>]]></content><author><name></name></author><category term="cranelift" /><category term="cg_clif" /><category term="rust" /><summary type="html"><![CDATA[There has been a ton of progress since the last progress report. There have been 303 commits since then. @afonso360 has been contributing a ton to improve Windows and AArch64 support. (Thanks a lot for that!)]]></summary></entry><entry><title type="html">Progress report on rustc_codegen_cranelift (June 2022)</title><link href="/2022/06/13/progress-report-june-2022.html" rel="alternate" type="text/html" title="Progress report on rustc_codegen_cranelift (June 2022)" /><published>2022-06-13T00:00:00+00:00</published><updated>2022-06-13T00:00:00+00:00</updated><id>/2022/06/13/progress-report-june-2022</id><content type="html" xml:base="/2022/06/13/progress-report-june-2022.html"><![CDATA[<p>It’s been quite a while since the <a href="https://bjorn3.github.io/2021/08/05/progress-report-july-2021.html">last progress report</a>. There have been <a href="https://github.com/bjorn3/rustc_codegen_cranelift/compare/05677b6bd6c938ed760835d9b1f6514992654ae3...ec841f58d38e5763bc0ad9f405ed5fa075e3fd30">393 commits</a> since the last progress report.</p>

<h1 id="achievements-in-the-past-ten-months">Achievements in the past ten months</h1>

<h4 id="migrating-away-from-rust-ar">Migrating away from rust-ar</h4>

<p>Since the start, archive file reading and writing have been done by the <a href="https://github.com/mdsteele/rust-ar">rust-ar</a> crate. While it has been very useful, there are a couple of limitations that necessitate moving away from it. First off, it doesn’t support writing symbol tables. While I managed to implement support for it with the GNU and BSD variants of the archive format, it doesn’t work with macOS, thus requiring usage of <code class="language-plaintext highlighter-rouge">ranlib</code> on macOS, which is slower than writing the symbol table while creating the archive file. Second, my changes to support symbol table writing haven’t been merged into rust-ar, which means that cg_clif has to depend on my own fork. This means that if I accidentally delete my fork, cg_clif would be broken. In addition, it doesn’t play nice with vendoring, which is necessary for building rust offline. And finally, rust-ar is not actively maintained.</p>

<p>To migrate away from it, I first switched archive file reading from rust-ar to the newly introduced archive file support in the object crate. I’m now working on integrating a Rust port of LLVM’s archive writer with rustc so all backends can share the same code.</p>

<ul>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/issues/1155">#1155</a>: Remove the ar git dependency</li>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/commit/1da50543dd6d1778856e24433d186fd39327def1">1da5054</a>: Use the object crate for archive reading during archive building</li>
  <li><a href="https://github.com/rust-lang/rust/pull/97485">rust-lang/rust#97485</a>: Rewrite LLVM’s archive writer in Rust</li>
</ul>

<h4 id="multi-threading-support">Multi-threading support</h4>

<p>Currently cg_clif does everything on a single thread, unlike cg_llvm, which does optimizations and emits object files in parallel. This means that, depending on how many codegen units can be compiled in parallel, cg_llvm can finish in less time than cg_clif. I have been slowly working on refactorings that will allow Cranelift to compile codegen units on background threads. These refactorings are necessary as currently a function is compiled immediately after it has been translated to Cranelift IR.</p>

<ul>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/commit/9089c305dad582cf0da4b84cad27b6fab54434b9">9089c30</a>: Remove TyCtxt dependency from UnwindContext</li>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/commit/5f6c59e63faf0705d4c6e1fbd7a66ffc59b9ae1f">5f6c59e</a>: Pass only the Function to write_clif_file</li>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/commit/78b65718bce8d7f8b2e1d1af74141cebbd78cf5f">78b6571</a>: Split compile_fn out of codegen_fn</li>
</ul>

<h4 id="simd">SIMD</h4>

<p>There have been a lot of fixes for portable-simd (the unstable <code class="language-plaintext highlighter-rouge">core::simd</code> module). Part of these also benefit stdarch (the <code class="language-plaintext highlighter-rouge">core::arch</code> module).</p>

<ul>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/commit/a8be7ea503211115d7e6339942544268de99bf17">a8be7ea</a>: Implement new simd_shuffle signature</li>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/commit/d288c6924d15e3202f006997167be0e54d307079">d288c69</a>: Implement simd_reduce_{min,max} for floats</li>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/commit/dd288d27de23e2f2180e71b6e9b36789ba388e6f">dd288d2</a>: Fix vector types containing an array field with mir opts enabled</li>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/commit/037aafbbaf2ee41a11807a1abdec24eb23f505c2">037aafb</a>: Fix simd type validation</li>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/commit/f3d97cce279fd2372aafec3761791b4110d70bf5">f3d97cc</a>: Fix saturating float casts test</li>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/commit/3c030e2425bb1fdb165ac87797076072ec991970">3c030e2</a>: Fix NaN handling of simd float min and max operations</li>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/commit/11007c02f70130cdc70b98f0909e5c150a2751a6">11007c0</a>: Use fma(f) libm function for simd_fma intrinsic</li>
</ul>

<h4 id="inline-assembly">Inline assembly</h4>

<p><a href="https://github.com/nbdd0121"><code class="language-plaintext highlighter-rouge">@nbdd0121</code></a> implemented support for register classes in PR #1206. Previously only fixed register constraints were supported.</p>

<p>I also fixed a couple of bugs in an attempt to compile Philipp Oppermann’s <a href="https://os.phil-opp.com/">blog os</a>. There are still many things missing for that to work though.</p>

<ul>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/pull/1206">#1206</a>: Improve inline asm support</li>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/commit/122219237437ee1deee33df9806a4316194a6f76">1222192</a>: Use cgu name instead of function name as base for inline asm wrapper name</li>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/commit/efdbd88a741074a799563ef08c96ff92905fbc1c">efdbd88</a>: Ensure inline asm wrapper name never starts with a digit</li>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/issues/1204">#1204</a>: Full asm!() support</li>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/issues/1208">#1208</a>: Support compiling blog os</li>
</ul>

<h4 id="misc-bug-fixes">Misc bug fixes</h4>

<ul>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/commit/f74cf39a7434c73424b9e5fddaf78996bd2b06c1">f74cf39</a>: Fix crash when struct argument size is not a multiple of the pointer size</li>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/commit/97e504549371d7640cf011d266e3c17394fdddac">97e5045</a>: Fix taking address of truly unsized type field of unsized adt</li>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/commit/f3fc94f2399e8244bb78af8e0e5f462b884083ac">f3fc94f</a>: Fix #[track_caller] with MIR inlining</li>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/commit/f52162f75c640618637e265d005f0f5f25811af5">f52162f</a>: Fix #[track_caller] location for function chains</li>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/commit/74b9232ee8001b6204a3c357a7793a6d152bd8ca">74b9232</a>: Fix assert_assignable for array types</li>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/commit/7a10059268e456ec89aa05e4df23a2b19b4d8395">7a10059</a>: Fix symbol tables in case of multiple object files with the same name</li>
</ul>

<h4 id="usage-changes">Usage changes</h4>

<p>There have been two big changes to the way cg_clif is used. First off, the cargo wrapper executable has been renamed to cargo-clif. This is necessary on Windows as otherwise the cargo wrapper would invoke itself when running cargo, due to Windows putting the current working directory in the search path for executables. It also allows invoking the wrapper as <code class="language-plaintext highlighter-rouge">cargo clif</code> in case you add the cg_clif build directory to your <code class="language-plaintext highlighter-rouge">$PATH</code>. The second change is that cg_clif is now always run using the <code class="language-plaintext highlighter-rouge">-Zcodegen-backend</code> rustc argument. This matches what happens when building cg_clif as part of rustc. Previously a wrapper <code class="language-plaintext highlighter-rouge">cg_clif</code> executable was used, which used rustc_driver to run rustc with cg_clif as backend. This change is only visible when you are directly using cg_clif/rustc without the <code class="language-plaintext highlighter-rouge">cargo-clif</code> wrapper. Usage of <code class="language-plaintext highlighter-rouge">cargo-clif</code> is advised.</p>

<ul>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/commit/0dd3d28cff91ed450e296efa4b9e7db9fb91373b">0dd3d28</a>: Rename cargo executable to cargo-clif</li>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/pull/1225">#1225</a>: Use -Zcodegen-backend instead of a custom rustc driver</li>
</ul>
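
<p>For those who do invoke rustc directly, the shape of the command line can be sketched as follows. This only illustrates the <code class="language-plaintext highlighter-rouge">-Zcodegen-backend</code> argument mentioned above. The function name <code class="language-plaintext highlighter-rouge">rustc_args</code> and both paths are placeholders I made up, and a real invocation most likely needs further arguments (for example a matching sysroot) that are left out here.</p>

```rust
/// Build the arguments for running rustc with an external codegen backend.
/// `backend_dylib` is a placeholder for the path to the built cg_clif
/// dynamic library; `source` is the crate root to compile.
/// Assumption: any other arguments a real invocation needs are omitted.
fn rustc_args(backend_dylib: &str, source: &str) -> Vec<String> {
    vec![
        format!("-Zcodegen-backend={backend_dylib}"),
        source.to_string(),
    ]
}
```

<p>The resulting list could be handed to <code class="language-plaintext highlighter-rouge">std::process::Command::new("rustc").args(...)</code>. Since <code class="language-plaintext highlighter-rouge">-Z</code> flags are unstable, this needs a nightly rustc.</p>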

<h4 id="perf-optimizations">Perf optimizations</h4>

<p>Both build time and runtime performance should be improved by several percent due to a couple of optimizations. A small improvement is the new support of Cranelift for cold blocks. These are placed at the end of the function to enable more efficient usage of the instruction cache and to reduce branch mispredictions, which slightly improves runtime performance. A much bigger improvement is the replacement of a lot of print+trap combinations with just a trap. While the prints have been very useful for debugging miscompilations, they also bloat compiled binaries a lot (up to ~30% improvement from removing them!). Given that miscompilations in cg_clif are quite rare nowadays, I removed most debug prints. The final improvement is caused by Cranelift switching to a new register allocator. This has improved build time by up to 7% and should also have improved runtime performance a bit.</p>

<ul>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/commit/90f8aefe7142d23a64ae95b5ae5a292a6e0519db">90f8aef</a>: Mark cold blocks</li>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/pull/1220">#1220</a>: Replace a lot of print+trap with plain trap</li>
  <li><a href="https://github.com/bytecodealliance/wasmtime/pull/3989">bytecodealliance/wasmtime#3989</a>: Switch Cranelift over to regalloc2</li>
</ul>
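
<p>The effect of marking blocks as cold can be pictured with a small sketch. This is not Cranelift code: <code class="language-plaintext highlighter-rouge">Block</code> and <code class="language-plaintext highlighter-rouge">layout</code> are hypothetical, and the sketch assumes the entry block is never marked cold. It only shows the reordering idea, where hot blocks stay together and cold blocks move to the end of the function.</p>

```rust
/// Hypothetical basic block: a label plus a flag saying it rarely executes.
#[derive(Debug, Clone, PartialEq)]
struct Block {
    label: &'static str,
    cold: bool,
}

/// Return the blocks in layout order: hot blocks first, cold blocks last.
/// The relative order inside each group is preserved.
fn layout(blocks: &[Block]) -> Vec<Block> {
    // `partition` puts the elements matching the predicate in the first Vec.
    let (cold, hot): (Vec<Block>, Vec<Block>) =
        blocks.iter().cloned().partition(|block| block.cold);
    hot.into_iter().chain(cold).collect()
}
```

<p>With the blocks <code class="language-plaintext highlighter-rouge">entry</code>, <code class="language-plaintext highlighter-rouge">panic</code> (cold), <code class="language-plaintext highlighter-rouge">loop</code> and <code class="language-plaintext highlighter-rouge">ret</code>, the panic path ends up after <code class="language-plaintext highlighter-rouge">ret</code>, so the commonly executed code is contiguous.</p>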

<h1 id="challenges">Challenges</h1>

<h4 id="windows-support-with-the-msvc-toolchain">Windows support with the MSVC toolchain</h4>

<p>Cranelift doesn’t yet support TLS for COFF/PE object files. This means that unlike MinGW which uses pthread keys for implementing TLS, it is not currently possible to compile for MSVC.</p>

<ul>
  <li>issue <a href="https://github.com/bytecodealliance/wasmtime/issues/1885">wasmtime#1885</a>: [Cranelift] Add COFF TLS support</li>
  <li>issue <a href="https://github.com/bjorn3/rustc_codegen_cranelift/issues/977">#977</a>: Windows support</li>
</ul>

<h4 id="simd-1">SIMD</h4>

<p>Many vendor intrinsics remain unimplemented. The new portable SIMD project will however likely exclusively use so called “platform intrinsics” of which there are much fewer, compared to the LLVM intrinsics used to implement all vendor intrinsics in <code class="language-plaintext highlighter-rouge">core::arch</code>. In addition “platform intrinsics” are the common denominator between platforms supported by rustc, so they only have to be implemented once in cg_clif itself and in fact most have already been implemented. Cranelift does need a definition for each platform when native SIMD is used, but emulating “platform intrinsics” using scalar instructions is pretty easy.</p>

<ul>
  <li>issue <a href="https://github.com/bjorn3/rustc_codegen_cranelift/issues/171">#171</a>: std::arch SIMD intrinsics</li>
</ul>

<h4 id="cleanup-during-stack-unwinding-on-panics">Cleanup during stack unwinding on panics</h4>

<p>Cranelift currently doesn’t have support for cleanup during stack unwinding.</p>

<ul>
  <li>issue <a href="https://github.com/bytecodealliance/wasmtime/issues/1677">wasmtime#1677</a>: Support cleanup during unwinding</li>
</ul>

<h1 id="contributing">Contributing</h1>

<p>Contributions are always appreciated. Feel free to take a look at <a href="https://github.com/bjorn3/rustc_codegen_cranelift/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22">good first issues</a> and ping me (@bjorn3) for help on either the relevant github issue or preferably on the <a href="https://rust-lang.zulipchat.com">rust lang</a> zulip if you get stuck.</p>]]></content><author><name></name></author><category term="cranelift" /><category term="cg_clif" /><category term="rust" /><summary type="html"><![CDATA[It’s been quite a while since the last progress report. There have been 393 commits since the last progress report.]]></summary></entry><entry><title type="html">Progress report on rustc_codegen_cranelift (July 2021)</title><link href="/2021/08/05/progress-report-july-2021.html" rel="alternate" type="text/html" title="Progress report on rustc_codegen_cranelift (July 2021)" /><published>2021-08-05T00:00:00+00:00</published><updated>2021-08-05T00:00:00+00:00</updated><id>/2021/08/05/progress-report-july-2021</id><content type="html" xml:base="/2021/08/05/progress-report-july-2021.html"><![CDATA[<p>Since the <a href="https://bjorn3.github.io/2021/04/13/progress-report-april-2021.html">last progress report</a> there have been <a href="https://github.com/bjorn3/rustc_codegen_cranelift/compare/29a4a551eb23969cde9a895d081bee682254974c...05677b6bd6c938ed760835d9b1f6514992654ae3">242 commits</a>.</p>

<h1 id="achievements-in-the-past-four-months">Achievements in the past four months</h1>

<h4 id="simd">SIMD</h4>

<p>Almost all integer tests and float tests of <a href="https://github.com/rust-lang/portable-simd/">portable-simd</a> (formerly stdsimd) now pass. A couple of operations are not yet implemented, but other than that it now works just fine.</p>

<p>In addition <a href="https://github.com/shamatar">@shamatar</a> implemented the <code class="language-plaintext highlighter-rouge">llvm.x86.addcarry.64</code> and <code class="language-plaintext highlighter-rouge">llvm.x86.subborrow.64</code> intrinsics as their first contribution. They are used by some of the <code class="language-plaintext highlighter-rouge">core::arch</code> SIMD intrinsics that the <code class="language-plaintext highlighter-rouge">num-bigint</code> crate uses.</p>

<ul>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/pull/1189">#1189</a>: Improve stdsimd support</li>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/pull/1178">#1178</a>: Implement llvm.x86.addcarry.64 and llvm.x86.subborrow.64</li>
</ul>

<h4 id="aarch64-support-on-linux">AArch64 support on Linux</h4>

<p>It is now possible to cross-compile to AArch64 Linux. Native compilation should work too, but isn’t tested. At the moment there does seem to be an ABI incompatibility around proc-macros though, so those don’t work when using native compilation.</p>

<ul>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/pull/1183">#1183</a>: AArch64 support on Linux</li>
</ul>

<h4 id="-ctarget-cpu-support"><code class="language-plaintext highlighter-rouge">-Ctarget-cpu</code> support</h4>

<p>Thanks to <a href="https://github.com/mominul"><code class="language-plaintext highlighter-rouge">@mominul</code></a> it is now possible to use <code class="language-plaintext highlighter-rouge">-Ctarget-cpu</code> with cg_clif. The given value is passed directly to Cranelift, so not every target cpu supported by LLVM is allowed, but <code class="language-plaintext highlighter-rouge">-Ctarget-cpu=native</code> works fine, as do the target cpus <a href="https://github.com/bytecodealliance/wasmtime/blob/85f16f488d4a0047e40a885fdacda832d46815e8/cranelift/codegen/meta/src/isa/x86/settings.rs#L168-L212">supported</a> by Cranelift.</p>

<ul>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/pull/1163">#1163</a>: Support <code class="language-plaintext highlighter-rouge">-Ctarget-cpu</code></li>
</ul>

<h4 id="rust-build-system">Rust build system</h4>

<p>The most important parts of the build system have been rewritten from bash scripts to rust code. This allows it to run on systems that don’t have bash like Windows. It is still necessary for git to be available though.</p>

<ul>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/pull/1180">#1180</a>: Rewrite part of the build system in rust</li>
</ul>

<h4 id="multithreading-support-for-the-jit-mode">Multithreading support for the JIT mode</h4>

<p><a href="https://github.com/eggyal"><code class="language-plaintext highlighter-rouge">@eggyal</code></a> implemented multithreading support for the lazy-jit mode. When a function is called that still needs to be lazily compiled, this compilation happens on the main rustc thread. This blocks compilation of other functions, but doesn’t interrupt other threads if they don’t need any function to be compiled.</p>

<ul>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/pull/1166">#1166</a>: Multithreading support for lazy-jit</li>
  <li><a href="https://github.com/bytecodealliance/wasmtime/pull/2786">bytecodealliance/wasmtime#2786</a>: Atomic hotswapping in JIT mode</li>
</ul>
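
<p>The behaviour described above can be approximated with a small sketch. It is not how cg_clif or Cranelift implement lazy-jit: <code class="language-plaintext highlighter-rouge">Jit</code>, <code class="language-plaintext highlighter-rouge">LazyFn</code>, <code class="language-plaintext highlighter-rouge">double</code> and <code class="language-plaintext highlighter-rouge">compile_double</code> are hypothetical, and a mutex stands in for "compilation happens on one thread". The point is only that the first caller triggers compilation, other callers of the same function wait for it, and functions that are already compiled are called without touching the compiler.</p>

```rust
use std::sync::{Mutex, OnceLock};

type CompiledFn = fn(i64) -> i64;

/// Hypothetical JIT state. The mutex serializes all compilation and
/// its content counts how many compilations have happened.
struct Jit {
    compile_lock: Mutex<usize>,
}

impl Jit {
    fn new() -> Self {
        Jit { compile_lock: Mutex::new(0) }
    }
}

/// Hypothetical lazily compiled function.
struct LazyFn {
    compiled: OnceLock<CompiledFn>,
    // Stand-in for the real compiler: returns the "compiled" code.
    compile: fn() -> CompiledFn,
}

impl LazyFn {
    fn new(compile: fn() -> CompiledFn) -> Self {
        LazyFn { compiled: OnceLock::new(), compile }
    }

    /// Call the function, compiling it first if nobody has done so yet.
    fn call(&self, jit: &Jit, arg: i64) -> i64 {
        let f = *self.compiled.get_or_init(|| {
            // Only one compilation runs at a time across all functions.
            let mut compilations = jit.compile_lock.lock().unwrap();
            *compilations += 1;
            (self.compile)()
        });
        f(arg)
    }
}

// Example stand-ins for a function body and its "compiler".
fn double(x: i64) -> i64 {
    x * 2
}

fn compile_double() -> CompiledFn {
    double
}
```

<p>Calling <code class="language-plaintext highlighter-rouge">LazyFn::new(compile_double).call(&amp;jit, 21)</code> from several threads compiles the function once. After that, the fast path is a plain load of the function pointer followed by the call.</p>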

<h1 id="challenges">Challenges</h1>

<p>While there are several important things currently missing, I am confident that I will be able to implement the most important ones in 2021.</p>

<h4 id="windows-support-with-the-msvc-toolchain">Windows support with the MSVC toolchain</h4>

<p>Cranelift doesn’t yet support TLS for COFF/PE object files. This means that unlike MinGW which uses pthread keys for implementing TLS, it is not currently possible to compile for MSVC.</p>

<ul>
  <li>issue <a href="https://github.com/bytecodealliance/wasmtime/issues/1885">wasmtime#1885</a>: [Cranelift] Add COFF TLS support</li>
  <li>issue <a href="https://github.com/bjorn3/rustc_codegen_cranelift/issues/977">#977</a>: Windows support</li>
</ul>

<h4 id="simd-1">SIMD</h4>

<p>Many vendor intrinsics remain unimplemented. The new portable SIMD project will however likely exclusively use so called “platform intrinsics” of which there are much fewer, compared to the LLVM intrinsics used to implement all vendor intrinsics in <code class="language-plaintext highlighter-rouge">core::arch</code>. In addition “platform intrinsics” are the common denominator between platforms supported by rustc, so they only have to be implemented once in cg_clif itself and in fact most have already been implemented. Cranelift does need a definition for each platform when native SIMD is used, but emulating “platform intrinsics” using scalar instructions is pretty easy.</p>

<ul>
  <li>issue <a href="https://github.com/bjorn3/rustc_codegen_cranelift/issues/171">#171</a>: std::arch SIMD intrinsics</li>
</ul>

<h4 id="cleanup-during-stack-unwinding-on-panics">Cleanup during stack unwinding on panics</h4>

<p>Cranelift currently doesn’t have support for cleanup during stack unwinding.</p>

<ul>
  <li>issue <a href="https://github.com/bytecodealliance/wasmtime/issues/1677">wasmtime#1677</a>: Support cleanup during unwinding</li>
</ul>

<h1 id="contributing">Contributing</h1>

<p>Contributions are always appreciated. Feel free to take a look at <a href="https://github.com/bjorn3/rustc_codegen_cranelift/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22">good first issues</a> and ping me (@bjorn3) for help on either the relevant github issue or preferably on the <a href="https://rust-lang.zulipchat.com">rust lang</a> zulip if you get stuck.</p>

<p>Thanks to <a href="https://github.com/cfallin">@cfallin</a> for giving feedback on this progress report.</p>]]></content><author><name></name></author><category term="cranelift" /><category term="cg_clif" /><category term="rust" /><summary type="html"><![CDATA[Since the last progress report there have been 242 commits.]]></summary></entry><entry><title type="html">Progress report on rustc_codegen_cranelift (April 2021)</title><link href="/2021/04/13/progress-report-april-2021.html" rel="alternate" type="text/html" title="Progress report on rustc_codegen_cranelift (April 2021)" /><published>2021-04-13T00:00:00+00:00</published><updated>2021-04-13T00:00:00+00:00</updated><id>/2021/04/13/progress-report-april-2021</id><content type="html" xml:base="/2021/04/13/progress-report-april-2021.html"><![CDATA[<p>Since the <a href="https://bjorn3.github.io/2021/02/01/progress-report-jan-2021.html">last progress report</a> there have been <a href="https://github.com/bjorn3/rustc_codegen_cranelift/compare/d556c56f792756dd7cfec742b9f2e07612dc10f4...29a4a551eb23969cde9a895d081bee682254974c">135 commits</a>.</p>

<h1 id="achievements-in-the-past-three-months">Achievements in the past three months</h1>

<h4 id="removed-support-for-old-style-cranelift-backends">Removed support for old style Cranelift backends</h4>

<p>In the <a href="https://bjorn3.github.io/2021/02/01/progress-report-jan-2021.html">previous</a> progress report I mentioned that I switched to using the new-style Cranelift backends by default. At the time I kept support for the old-style backends just in case I would find a critical bug. There haven’t been any issues with the new backend since, so support for old-style backends has been removed.</p>

<ul>
  <li>commit <a href="https://github.com/bjorn3/rustc_codegen_cranelift/commit/92f765fce96b6344ccfe9b288bbd8b652f5ad0ef">92f765f</a>: Remove support for x86 oldBE</li>
</ul>

<h4 id="atomics">Atomics</h4>

<p>Atomic operations are now implemented using native atomic instructions instead of being emulated using a global lock. This is much more efficient and also works when pthreads is not available. As only new-style backends implement them, I couldn’t use them until support for the old-style backends was removed.</p>

<ul>
  <li>commit <a href="https://github.com/bjorn3/rustc_codegen_cranelift/commit/f2f5452089a6cf8eb611badf20118960030f6585">f2f5452</a>: Use real atomic instructions instead of a global lock</li>
</ul>
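
<p>To make the difference concrete, here is a minimal sketch that contrasts the two approaches at the Rust source level. It is not the code cg_clif generates. The function names are made up, and a per-value <code class="language-plaintext highlighter-rouge">Mutex</code> stands in for the single global lock the old emulation used.</p>

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Mutex;
use std::thread;

/// Emulated atomic add: take a lock, then do a plain read-modify-write.
/// Returns the previous value, like a real fetch-add.
fn locked_fetch_add(cell: &Mutex<u64>, value: u64) -> u64 {
    let mut guard = cell.lock().unwrap();
    let old = *guard;
    *guard += value;
    old
}

/// Native atomic add: a single atomic read-modify-write, no lock involved.
fn native_fetch_add(cell: &AtomicU64, value: u64) -> u64 {
    cell.fetch_add(value, Ordering::SeqCst)
}

/// Increment both kinds of counter from several threads and return
/// the final values as (locked, native).
fn hammer(threads: u64, iterations: u64) -> (u64, u64) {
    let locked = Mutex::new(0);
    let native = AtomicU64::new(0);
    thread::scope(|scope| {
        for _ in 0..threads {
            scope.spawn(|| {
                for _ in 0..iterations {
                    locked_fetch_add(&locked, 1);
                    native_fetch_add(&native, 1);
                }
            });
        }
    });
    (locked.into_inner().unwrap(), native.into_inner())
}
```

<p>Both counters end up with the same value. The difference is that the locked variant serializes every operation through the lock and needs a lock implementation to exist at all, while the native variant maps to the atomic instructions of the target.</p>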

<h4 id="cross-compilation-to-windows-using-mingw">Cross-compilation to Windows using MinGW</h4>

<p>It is now possible to cross-compile to Windows using MinGW. This required implementing a couple of things, like using the right ABI for calling intrinsics defined in compiler_builtins and adding cross-compilation support to the build system of cg_clif.</p>

<ul>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/pull/1145">#1145</a>: Support cross-compiling to Windows using MinGW</li>
</ul>

<h4 id="run-the-rustc-test-suite-on-ci">Run the rustc test suite on CI</h4>

<p>The rustc test suite is now run on CI by default to prevent regressions. A <a href="https://github.com/bjorn3/rustc_codegen_cranelift/blob/29a4a551eb23969cde9a895d081bee682254974c/scripts/test_rustc_tests.sh#L13-L85">lot</a> of tests are currently ignored, but most of those are either LLVM specific (e.g. asm tests) or require unimplemented features like panicking.</p>

<ul>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/pull/1149">#1149</a>: Run the rustc test suite on CI</li>
</ul>

<h1 id="challenges">Challenges</h1>

<p>While there are several important things currently missing, I am confident that I will be able to implement the most important ones in 2021.</p>

<h4 id="windows-support-with-the-msvc-toolchain">Windows support with the MSVC toolchain</h4>

<p>Cranelift doesn’t yet support TLS for COFF/PE object files. This means that unlike MinGW which uses pthread keys for implementing TLS, it is not currently possible to compile for MSVC.</p>

<ul>
  <li>issue <a href="https://github.com/bytecodealliance/wasmtime/issues/1885">wasmtime#1885</a>: [Cranelift] Add COFF TLS support</li>
  <li>issue <a href="https://github.com/bjorn3/rustc_codegen_cranelift/issues/977">#977</a>: Windows support</li>
</ul>

<h4 id="simd">SIMD</h4>

<p>Many vendor intrinsics remain unimplemented. The new portable SIMD project will however likely exclusively use so called “platform intrinsics” of which there are much fewer, compared to the LLVM intrinsics used to implement all vendor intrinsics in <code class="language-plaintext highlighter-rouge">core::arch</code>. In addition “platform intrinsics” are the common denominator between platforms supported by rustc, so they only have to be implemented once in cg_clif itself. Cranelift does need a definition for each platform when native SIMD is used, but emulating “platform intrinsics” using scalar instructions is pretty easy.</p>

<ul>
  <li>issue <a href="https://github.com/bjorn3/rustc_codegen_cranelift/issues/171">#171</a>: std::arch SIMD intrinsics</li>
</ul>

<h4 id="cleanup-during-stack-unwinding-on-panics">Cleanup during stack unwinding on panics</h4>

<p>Cranelift currently doesn’t have support for cleanup during stack unwinding.</p>

<ul>
  <li>issue <a href="https://github.com/bytecodealliance/wasmtime/issues/1677">wasmtime#1677</a>: Support cleanup during unwinding</li>
</ul>

<h1 id="contributing">Contributing</h1>

<p>Contributions are always appreciated. Feel free to take a look at <a href="https://github.com/bjorn3/rustc_codegen_cranelift/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22">good first issues</a> and ping me (@bjorn3) for help on either the relevant github issue or preferably on the <a href="https://rust-lang.zulipchat.com">rust lang</a> zulip if you get stuck.</p>

<p>Thanks to <a href="https://github.com/bnjbvr">@bnjbvr</a> for giving feedback on this progress report.</p>]]></content><author><name></name></author><category term="cranelift" /><category term="cg_clif" /><category term="rust" /><summary type="html"><![CDATA[Since the last progress report there have been 135 commits.]]></summary></entry><entry><title type="html">Progress report on rustc_codegen_cranelift (Jan 2021)</title><link href="/2021/02/01/progress-report-jan-2021.html" rel="alternate" type="text/html" title="Progress report on rustc_codegen_cranelift (Jan 2021)" /><published>2021-02-01T00:00:00+00:00</published><updated>2021-02-01T00:00:00+00:00</updated><id>/2021/02/01/progress-report-jan-2021</id><content type="html" xml:base="/2021/02/01/progress-report-jan-2021.html"><![CDATA[<p><a href="https://github.com/bjorn3/rustc_codegen_cranelift">Rustc_codegen_cranelift</a> (cg_clif) is an alternative backend for rustc that I have been working on for the past two years. It uses the Cranelift code generator. Unlike LLVM which is optimized for output quality at the cost of compilation speed even when optimizations are disabled, Cranelift is optimized for compilation speed while producing executables that are almost as fast as LLVM with optimizations disabled. This has the potential to reduce the compilation times of rustc in debug mode.</p>

<p>Since the <a href="https://bjorn3.github.io/2021/01/07/progress-report-dec-2020.html">last progress report</a> there have been <a href="https://github.com/bjorn3/rustc_codegen_cranelift/compare/dbee13661efa269cb4cd57bb4c6b99a19732b484...d556c56f792756dd7cfec742b9f2e07612dc10f4">54 commits</a>.</p>

<h1 id="achievements-in-the-past-months">Achievements in the past months</h1>

<h4 id="tada-abi-compatibility-tada">:tada: ABI compatibility :tada:</h4>

<p>The biggest achievement this time is ABI compatibility with cg_llvm and C. This fixed several crashes when linking against C code. This also makes it possible to mix and match crates compiled with cg_clif and compiled with cg_llvm. This may be useful for game development by compiling the game engine using cg_llvm with optimizations enabled for runtime performance and then compiling the game logic using cg_clif for incremental compilation time.</p>

<p>There is currently no easy way to mix codegen backends for different crates, but I do have a cargo PR open that would allow it. I do not expect it to land as is, but I hope something like it will be merged.</p>

<ul>
  <li><a href="https://github.com/rust-lang/rust/pull/80594">rust#80594</a>: Various ABI refactorings</li>
  <li>issue <a href="https://github.com/bjorn3/rustc_codegen_cranelift/issues/10">#10</a>: C abi compatability</li>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/pull/1131">#1131</a>: Full abi compatibilty</li>
  <li><a href="https://github.com/rust-lang/cargo/pull/9118">cargo#9118</a>: Add a profile option to select the codegen backend</li>
</ul>
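
<p>As a rough illustration of what ABI compatibility covers, the sketch below passes a <code>#[repr(C)]</code> struct by value through an <code>extern "C"</code> function. The <code>Vec3</code> type and the <code>scale</code> function are hypothetical names made up for this example; they are not taken from cg_clif. For a call like this to work across backends or against C code, the caller and the callee have to agree on how the struct is split between registers and the stack.</p>

```rust
// Hypothetical example type; not taken from cg_clif or its test suite.
// `#[repr(C)]` gives the struct the same field layout a C compiler would use.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Vec3 {
    pub x: f32,
    pub y: f32,
    pub z: f32,
}

// `extern "C"` selects the platform's C calling convention, so the struct
// argument and the struct return value must be passed the way C expects.
pub extern "C" fn scale(v: Vec3, factor: f32) -> Vec3 {
    Vec3 {
        x: v.x * factor,
        y: v.y * factor,
        z: v.z * factor,
    }
}

fn main() {
    let scaled = scale(Vec3 { x: 1.0, y: 2.0, z: 3.0 }, 2.0);
    assert_eq!(scaled, Vec3 { x: 2.0, y: 4.0, z: 6.0 });
    println!("{:?}", scaled);
}
```

<p>In a mixed build, <code>scale</code> could live in a crate compiled by one backend and be called from a crate compiled by the other. Both sides are in a single file here only to keep the example self-contained.</p>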

<h4 id="switch-to-the-new-backend-framework-of-cranelift">Switch to the new backend framework of Cranelift</h4>

<p>Cranelift is currently switching to a new backend framework. This framework produces faster code and has support for AArch64. Since the last progress report <a href="https://github.com/cfallin">@cfallin</a> has landed all features and bug fixes necessary to compile using the x64 backend based on the new framework. This allowed me to switch to it by default. So far no new problems have surfaced, but I plan to retain compatibility with the old backend for a little bit longer just in case.</p>

<ul>
  <li><a href="https://cfallin.org/blog/2020/09/18/cranelift-isel-1/">https://cfallin.org/blog/2020/09/18/cranelift-isel-1/</a></li>
  <li><a href="https://cfallin.org/blog/2021/01/22/cranelift-isel-2/">https://cfallin.org/blog/2021/01/22/cranelift-isel-2/</a></li>
  <li><a href="https://github.com/bytecodealliance/wasmtime/pull/2538">wasmtime#2538</a>: Multi-register value support: framework for Values wider than machine registers.</li>
  <li><a href="https://github.com/bytecodealliance/wasmtime/pull/2539">wasmtime#2539</a>: Support for I128 operations in x64 backend.</li>
  <li><a href="https://github.com/bytecodealliance/wasmtime/pull/2540">wasmtime#2540</a>: Add ELF TLS support in new x64 backend.</li>
  <li><a href="https://github.com/bytecodealliance/wasmtime/pull/2541">wasmtime#2541</a>: x64 and aarch64: allow StructArgument and StructReturn args.</li>
  <li><a href="https://github.com/bytecodealliance/wasmtime/pull/2558">wasmtime#2558</a>: x64: support PC-rel symbol references using the GOT when in PIC mode.</li>
  <li><a href="https://github.com/bytecodealliance/wasmtime/pull/2595">wasmtime#2595</a>: Implement Mach-O TLS access for x64 newBE</li>
</ul>

<h1 id="challenges">Challenges</h1>

<p>Several important features are still missing, but I am confident that I will be able to implement the most important ones in 2021.</p>

<h4 id="atomics">Atomics</h4>

<p>Atomic instructions are currently emulated using a global lock. This is very inefficient and only works when pthreads is available. The new-style backends for Cranelift have native support for atomic instructions. I will switch to them once I drop support for the old-style x86 backend.</p>

<ul>
  <li><a href="https://github.com/bytecodealliance/wasmtime/pull/2077">wasmtime#2077</a>: Implement Wasm Atomics for Cranelift/newBE/aarch64.</li>
  <li><a href="https://github.com/bytecodealliance/wasmtime/pull/2149">wasmtime#2149</a>: This patch fills in the missing pieces needed to support wasm atomics…</li>
</ul>
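
<p>To give an idea of what emulating atomics with a global lock means, here is a minimal sketch. It is not the code cg_clif uses: <code>GLOBAL_ATOMIC_LOCK</code>, <code>EmulatedAtomicU32</code> and <code>count_with_both</code> are hypothetical names, and the sketch assumes a <code>std::sync::Mutex</code> in place of a raw pthread mutex. Every emulated operation takes the same lock, whereas the native <code>AtomicU32</code> can be updated without any lock.</p>

```rust
use std::cell::UnsafeCell;
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::Mutex;
use std::thread;

// Hypothetical single lock shared by *all* emulated atomic operations.
static GLOBAL_ATOMIC_LOCK: Mutex<()> = Mutex::new(());

// Hypothetical stand-in for an atomic integer without native support.
struct EmulatedAtomicU32(UnsafeCell<u32>);

// SAFETY: the inner value is only ever accessed while GLOBAL_ATOMIC_LOCK is held.
unsafe impl Sync for EmulatedAtomicU32 {}

impl EmulatedAtomicU32 {
    fn new(value: u32) -> Self {
        EmulatedAtomicU32(UnsafeCell::new(value))
    }

    // Read-modify-write under the global lock; returns the previous value.
    fn fetch_add(&self, value: u32) -> u32 {
        let _guard = GLOBAL_ATOMIC_LOCK.lock().unwrap();
        // SAFETY: the global lock is held, so no other thread touches the value.
        unsafe {
            let slot = self.0.get();
            let old = *slot;
            *slot = old.wrapping_add(value);
            old
        }
    }

    fn load(&self) -> u32 {
        let _guard = GLOBAL_ATOMIC_LOCK.lock().unwrap();
        // SAFETY: the global lock is held, so no other thread touches the value.
        unsafe { *self.0.get() }
    }
}

// Increment an emulated and a native counter from several threads.
fn count_with_both(threads: u32, increments: u32) -> (u32, u32) {
    let emulated = EmulatedAtomicU32::new(0);
    let native = AtomicU32::new(0);
    thread::scope(|scope| {
        for _ in 0..threads {
            scope.spawn(|| {
                for _ in 0..increments {
                    emulated.fetch_add(1);
                    native.fetch_add(1, Ordering::SeqCst);
                }
            });
        }
    });
    (emulated.load(), native.load(Ordering::SeqCst))
}

fn main() {
    let (emulated, native) = count_with_both(4, 1000);
    assert_eq!(emulated, 4000);
    assert_eq!(native, 4000);
    println!("emulated = {emulated}, native = {native}");
}
```

<p>Both counters end up with the same value, but with the emulation unrelated atomics in different threads contend on the same lock, which is where the inefficiency comes from.</p>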

<h4 id="windows-support">Windows support</h4>

<p>There are still various issues with Windows support. See issue <a href="https://github.com/bjorn3/rustc_codegen_cranelift/issues/977">#977</a> for more information.</p>

<ul>
  <li>issue <a href="https://github.com/bytecodealliance/wasmtime/issues/1885">wasmtime#1885</a>: [Cranelift] Add COFF TLS support</li>
  <li>issue <a href="https://github.com/bjorn3/rustc_codegen_cranelift/issues/977">#977</a>: Windows support</li>
  <li>branch <a href="https://github.com/bjorn3/rustc_codegen_cranelift/compare/wip_windows_support3">wip_windows_support3</a></li>
</ul>

<h4 id="simd">SIMD</h4>

<p>Many vendor intrinsics remain unimplemented. However, the new portable SIMD project will likely use platform intrinsics exclusively. There are far fewer of these than there are LLVM intrinsics used to implement all vendor intrinsics in <code class="language-plaintext highlighter-rouge">core::arch</code>. In addition, platform intrinsics are architecture independent, so they only have to be implemented once.</p>

<ul>
  <li>issue <a href="https://github.com/bjorn3/rustc_codegen_cranelift/issues/171">#171</a>: std::arch SIMD intrinsics</li>
</ul>

<h4 id="cleanup-during-stack-unwinding-on-panics">Cleanup during stack unwinding on panics</h4>

<p>Cranelift currently doesn’t have support for cleanup during stack unwinding.</p>

<ul>
  <li>issue <a href="https://github.com/bytecodealliance/wasmtime/issues/1677">wasmtime#1677</a>: Support cleanup during unwinding</li>
</ul>

<h4 id="maintenance">Maintenance</h4>

<p>While there have been several PRs from other people, I am the only person who has contributed more than a few changes to cg_clif.</p>

<ul>
  <li><a href="https://github.com/bjorn3/rustc_codegen_cranelift/pulls?q=is%3Apr+is%3Aclosed+-author%3Aapp%2Fdependabot-preview">https://github.com/bjorn3/rustc_codegen_cranelift/pulls?q=is%3Apr+is%3Aclosed+-author%3Aapp%2Fdependabot-preview</a></li>
</ul>]]></content><author><name></name></author><category term="cranelift" /><category term="cg_clif" /><category term="rust" /><summary type="html"><![CDATA[Rustc_codegen_cranelift (cg_clif) is an alternative backend for rustc that I have been working on for the past two years. It uses the Cranelift code generator. Unlike LLVM which is optimized for output quality at the cost of compilation speed even when optimizations are disabled, Cranelift is optimized for compilation speed while producing executables that are almost as fast as LLVM with optimizations disabled. This has the potential to reduce the compilation times of rustc in debug mode.]]></summary></entry></feed>