Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace windows with embedded bindings #778

Closed
wants to merge 7 commits into from

Conversation

Jake-Shadle
Copy link
Contributor

@Jake-Shadle Jake-Shadle commented Apr 18, 2023

This PR changes from using the windows crate to the windows-sys crate, embedding the various COM interfaces needed by this crate and then using windows-sys for the other functions/types used in the crate. This drastically reduces compile time since windows is both extremely generic as well as each feature including a ton of code of which only a small fraction is actually used by this crate. (eg. the Win32_Media_Audio feature is ~14k LOC). Note that #616 mentions https://github.com/HEnquist/wasapi-rs, however that crate is outdated and still has a dependency on windows, but I can split out the COM code in here to a separate crate if that is wanted.

  • Crate download: ~12MiB -> ~2.5MiB
  • Untarred source: ~171MiB -> ~30MiB
  • Build (debug): ~9.8s (1.2s codegen) -> ~1.5s
  • Build (release): ~11.04s (2.8s codegen) -> ~1.17s

See: #727
Closes: #616

Comment on lines 63 to 110
pub(super) type ComResult<T> = Result<T, HRESULT>;

#[repr(C)]
pub(crate) struct Interface<T> {
vtable: *mut T,
}

impl<T> Interface<T> {
#[inline]
pub(crate) fn vtbl(&self) -> &T {
unsafe { &*self.vtable }
}
}

#[repr(transparent)]
pub(super) struct Object<T>(*mut Interface<T>);

impl<T> Object<T> {
#[inline]
pub(crate) fn vtbl(&self) -> &T {
unsafe { (*self.0).vtbl() }
}

#[inline]
pub(crate) fn ptr(&self) -> *mut c_void {
self.0.cast()
}
}

impl<T> fmt::Debug for Object<T> {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
write!(f, "{:p}", self.0)
}
}

impl<T> Drop for Object<T> {
fn drop(&mut self) {
unsafe { ((*self.0.cast::<Interface<IUnknownV>>()).vtbl().release)(self.ptr()) };
}
}

impl<T> Clone for Object<T> {
fn clone(&self) -> Self {
unsafe { ((*self.0.cast::<Interface<IUnknownV>>()).vtbl().add_ref)(self.ptr()) };
Self(self.0)
}
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just minimal basic types for wrapping the COM pointers.

#[repr(transparent)]
struct ILanguageExceptionErrorInfo2(Object<ILanguageExceptionErrorInfo2V>);

pub(crate) fn get_error_message(e: HRESULT) -> String {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This block of code + associated types above are just replicating the message retrieval that windows uses in its result type. Ideally this would be in its own crate but I didn't think it made sense to use both windows and windows-sys since there are already version headaches for them.

@@ -0,0 +1,680 @@
#![allow(non_snake_case)]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file just has all of the audio COM objects/interfaces used by cpal, and just keeps the same interface that windows was using. All methods not used by cpal are just replaced with usize to keep the layout correct but not take the compile hit of typechecking a function pointer that is never actually used.

// We manually create this extern so that we don't have to include the "Security" feature,
// which adds 70 structs, 394 constants, 133 functions, 31 type aliases, and
// 5 unions....none of which this crate uses
windows_targets::link!("kernel32.dll" "system" fn CreateEventA(lpeventattributes: *const std::ffi::c_void, manualreset: i32, initialstate: i32, name: windows_sys::core::PCSTR) -> isize);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As noted in the comment, we just manually bind this as otherwise the Security feature needs to be enabled which is entirely pointless for this crate.

Comment on lines 19 to 21
use super::com::audio;
use super::com::threading;
use super::com::ComResult;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We add these imports with different casing, so that the case just needs to be changed on the bindings embedded in this crate, but then the data types (eg WAVEFORMATEX) just come from windows-sys instead of windows.

Comment on lines -425 to +421
Err(ref e) if e.code() == Audio::AUDCLNT_E_DEVICE_INVALIDATED => {
return Err(SupportedStreamConfigsError::DeviceNotAvailable)
}
Err(e) => {
let description = format!("{}", e);
let err = BackendSpecificError { description };
return Err(err.into());
return Err(windows_err_to_cpal_err(e));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This pattern came up several times, so was an opportunity to get rid of redundant code.

Comment on lines 237 to 230
debug_assert!(handles.len() <= SystemServices::MAXIMUM_WAIT_OBJECTS as usize);
debug_assert!(handles.len() <= 64 /*SystemServices::MAXIMUM_WAIT_OBJECTS as usize*/);
let result = unsafe {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The SystemServices feature includes 14.6k LOC in windows and 10k LOC in windows-sys which is...kind of ridiculous when we only need 1 constant.

@Ralith
Copy link
Contributor

Ralith commented Apr 20, 2023

Awesome improvements! Would it make sense to take this a step further and use the windows-bindgen crate, to avoid even downloading the unnecessary -sys bindings?

@Jake-Shadle
Copy link
Contributor Author

I'm totally up for that, I started off with windows-sys because previous PRs I've made that get rid of dependencies altogether have all gotten push back which has been annoying, but it's up to the maintainers IMO so if you're up for it then that's even better.

@Ralith
Copy link
Contributor

Ralith commented Apr 20, 2023

I can't speak for maintainership, but it's at least an interesting possibility.

@est31
Copy link
Member

est31 commented Apr 20, 2023

but I can split out the COM code in here to a separate crate if that is wanted.

What about filing a PR to the windows crate directly to further split features or suggesting it to them? In this discussion the windows maintainers seem to be kinda open to suggestions.

Generally I'm not the greatest fan of the windows not having stopped with the rapid release schedule. This is not good for cpal in specific and the crates ecosystem in general. (Edit: switching to windows-sys doesn't directly put us into a better situation here, just that the cost that two versions of windows-sys in a crate graph incur is lower).

@est31
Copy link
Member

est31 commented Apr 20, 2023

I haven't made up my mind about this PR yet, and haven't given it a very close look yet, but thanks for filing it because it's an important thing to discuss -- I've suggested switching to the -sys crate earlier, and it's really amazing to see such build time improvements. But on the other hand, the diff is larger than I'd like. Accepting any PR means for me as a maintainer that I either have to maintain that merged code in some fashion, or have to do work to revert to the old state.

So I don't know how that code should be maintained. Where is it from? Was it auto-generated? What if windows_sys changes their API, is there a way to auto-generate it again? If so, maybe that should be made as easy as possible, i.e. not putting it into files with manually written code, isolating it into one directory, and documenting how to do it. Putting it into a separate crate isn't that helpful, it means an overhead when publishing things if the crate is ours, and also additional complexity for doing changes whether it is ours or not.

@Jake-Shadle
Copy link
Contributor Author

What about filing a PR to the windows crate directly to further split features or suggesting it to them?

I don't know if something like that would ever happen. The feature granularity is essentially just inherited from the windows metadata namespaces, so either those namespaces would need to be more granular, or the windows crate would need to add additional sub features on top of them specifically when creating Rust bindings, which seems tedious for both the maintainers and end users. IMO the "bind what your crate needs" approach used in this PR makes the most sense.

Generally I'm not the greatest fan of the windows not having stopped with the rapid release schedule. This is not good for cpal in specific and the crates ecosystem in general. (Edit: switching to windows-sys doesn't directly put us into a better situation here, just that the cost that two versions of windows-sys in a crate graph incur is lower).

I'm very much aware and agree since I was the one who opened microsoft/windows-rs#1720, I would have removed windows-sys completely but I've gotten pushback in several previous PRs about this, so keeping windows-sys was my attempt at increasing the chances this PR would be merged, removing it is entirely is easily doable though.

I haven't made up my mind about this PR yet, and haven't given it a very close look yet, but thanks for filing it because it's an important thing to discuss -- I've #726, and it's really amazing to see such build time improvements. But on the other hand, the diff is larger than I'd like. Accepting any PR means for me as a maintainer that I either have to maintain that merged code in some fashion, or have to do work to revert to the old state.

Understandable, I'm opening this in the hopes of merging, but even if it isn't merged in this current state at least starting a discussion of changes how windows is interacted with, in whatever form that takes. This crate is one of only 3 in our dependency graph (of many hundreds) that depend on windows so we're motivated to find a solution.

(the other two)

So I don't know how that code should be maintained. Where is it from? Was it auto-generated? What if windows_sys changes their API, is there a way to auto-generate it again? If so, maybe that should be made as easy as possible, i.e. not putting it into files with manually written code, isolating it into one directory, and documenting how to do it. Putting it into a separate crate isn't that helpful, it means an overhead when publishing things if the crate is ours, and also additional complexity for doing changes whether it is ours or not.

The current code is a mix of copy paste of the vtables from the windows crate and just manual implementation of the methods to preserve the existing API. I already have a tool to do bindgen of the windows-sys level of the API, but it doesn't have COM support, though that would be trivial to add, that way one could generate the vtables (excluding the methods that the crate doesn't want) as well as all of the data types needed (without windows_sys), so if that's wanted I can add that and open source the tool so the code can be regenerated/changed over time.

@riverar
Copy link

riverar commented Apr 20, 2023

@Jake-Shadle Is there something about the version churn claims that we're missing in microsoft/windows-rs#1720? We haven't seen any data to support your ecosystem impact claims. But do please throw something in there if we're missing something.

Any particular reason windows-bindgen isn't used here despite indicating you would? This is a switch to windows-sys, ignore.

Would be great to also get more context/data around why you are aggressively stripping windows crate dependencies at the expense of increasing crate maintanence, churn, and failure risk. Are these perhaps dependencies your projects are using and you're trying to trim the fat?

@Jake-Shadle
Copy link
Contributor Author

Version churn is still an issue, but removing windows is more about reducing compile times/download size/diskspace.

Would be great to also get more context/data around why you are aggressively stripping windows crate dependencies at the expense of #778 (comment), amodm/webbrowser-rs#62, and Amanieu/parking_lot#378. Are these perhaps dependencies your projects are using and you're trying to trim the fat?

Yes, I don't go around submitting PRs to crates we don't use.

  • Increasing crate maintenance - This is IMO not a big deal, yes it is more code in the crate, but the maintenance of that code is not even close to the same level as normal code because the API doesn't change. The only time the code would really meaningfully need to change would be adding support for some new Windows audio feature or interface, but that would presumably require regular work in the code calling into windows regardless.
  • Churn - The PR you linked had 1 small mistake that was easily fixed, the whole point of putting the bindings inside the crate is to reduce the churn of updating windows/-sys versions (though to be fair windows cadence is less bad than windows-sys)
  • Failure risk - I'm fine with using windows-targets, it doesn't impact compile times, but I also think the "failure risk" is exaggerated

Removing the dependency on windows for this crate and the others means we get rid of a 30+s of build time
image

The reason that time is so much longer than the times I stated above is because in a large workspace with hundreds of crates cargo will by default saturate the CPU, and thus crates with longer compile times tend to become even longer since they are competing with each other for CPU time as opposed to the isolated case of eg. just building cpal which can't saturate the CPU due to how few crates are built.

The situation is of course much better for windows-sys as unlike windows it doesn't actually have any implementation to compile/massive amount generics, but in a large enough project you are guaranteed to have multiple versions of windows-sys due to how often it bumps minor versions (we currently have 3). But still not great...

image

The compile time here is not the worst, but it has the negative impact that it blocks 23 crates from beginning compilation, and the compilation time can be far better since every feature enabled on windows-sys will by default include dozens to thousands of lines of type/struct/constant/function bindings of which a vast majority will not be used but are still paid for.

@Jake-Shadle Jake-Shadle changed the title Replace windows with windows-sys Replace windows with embedded bindings May 31, 2023
@Jake-Shadle Jake-Shadle marked this pull request as ready for review May 31, 2023 15:29
@Jake-Shadle
Copy link
Contributor Author

Jake-Shadle commented May 31, 2023

@est31 This PR now replaces windows with embedded bindings for everything instead of mixing manual COM interfaces and windows-sys bindings for types/functions.

The code is generated with a tool that I haven't published yet (source code here) as cpal was one of the crates I was using as a test case, but is otherwise just using windows-metadata the same as the windows/windows-sys crates to get and read the metadata used to generate the bindings.

While the embedded bindings may seem large at ~1120 lines, they are a minute fraction of code that would otherwise be compiled using either windows or windows-sys.

Summary for 11 namespaces
  records: 52 / 2765 => 1.88%
  functions: 9 / 2375 => 0.38%
  constants: 25 / 23085 => 0.11%
  aliases: 12 / 856 => 1.40%
  func pointers: 0 / 604 => 0.00%
  interfaces: 10 / 10 => 100.00%

PS: The windows CI failures happen on master, I'll rebase once/if they are fixed.

@MarijnS95
Copy link
Contributor

The code is generated with a tool that I haven't published yet (source code here)

@Jake-Shadle was this repo removed now that Microsoft feature-completed and published their official tooling that achieves the same?

And are you perhaps able to respin this PR using the official tooling? Here's what that looks like: Traverse-Research/amd-ext-d3d-rs@fa55fdb. Most important is bindings.txt and api_gen/src/main.rs.

@Jake-Shadle
Copy link
Contributor Author

Probably not, the official tooling configurability is too limited and the output has a lot of undesirable pieces that users can't control. I'm more likely going to end up going a different route and just generating a patch crate for windows/-sys since the patch process means I don't have to make PRs that people don't like.

@Jake-Shadle Jake-Shadle closed this Sep 6, 2023
@MarijnS95
Copy link
Contributor

Fair enough, the tooling had a bunch of push-back on useful features so it seems unlikely that any requested or contributed improvement makes it in.

In particular the generator emits all functions for a type, even those that depend on external modules which you will now also have to generate. There's the cfg option to emit cfg(feature) bounds but that is exactly what you don't want for generating embedded bindings: those internal guards shouldn't leak out to the public features interface in Cargo.toml.
Is that what you meant by "the output has a lot of undesirable pieces that users can't control"? Perhaps I should make an upstream issue about this.

Looking forward to seeing your patches/fork of those crates, keep me in the loop!

@Jake-Shadle
Copy link
Contributor Author

Yes, my bindings allow one to selectively emit particular methods and their tree of required types rather than all methods, there is no cfg output (other than the rare arch specific functions/structs), options to use rust casing for output items, improvements to enum output, options to not emit all of the completely unnecessary generic parameterization, etc.

I've given up making issues/patches to windows-rs, as it's clear they just have a fundamentally different view of how bindings should work, so good luck.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Plans for windows-rs?
5 participants