Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix "simd_support" feature #1056

Merged
merged 3 commits into from Oct 15, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
4 changes: 2 additions & 2 deletions Cargo.toml
Expand Up @@ -61,8 +61,8 @@ serde = { version = "1.0.103", features = ["derive"], optional = true }

[dependencies.packed_simd]
# NOTE: so far no version works reliably due to dependence on unstable features
version = "0.3"
# git = "https://github.com/rust-lang-nursery/packed_simd"
package = "packed_simd_2"
version = "0.3.4"
optional = true
features = ["into_bits"]

Expand Down
6 changes: 3 additions & 3 deletions src/distributions/integer.rs
Expand Up @@ -11,9 +11,9 @@
use crate::distributions::{Distribution, Standard};
use crate::Rng;
#[cfg(all(target_arch = "x86", feature = "simd_support"))]
use core::arch::x86::{__m64, __m128i, __m256i};
use core::arch::x86::{__m128i, __m256i};
#[cfg(all(target_arch = "x86_64", feature = "simd_support"))]
use core::arch::x86_64::{__m64, __m128i, __m256i};
use core::arch::x86_64::{__m128i, __m256i};
#[cfg(not(target_os = "emscripten"))] use core::num::NonZeroU128;
use core::num::{NonZeroU16, NonZeroU32, NonZeroU64, NonZeroU8, NonZeroUsize};
#[cfg(feature = "simd_support")] use packed_simd::*;
Expand Down Expand Up @@ -158,7 +158,7 @@ simd_impl!(512, u8x64, i8x64, u16x32, i16x32, u32x16, i32x16, u64x8, i64x8,);
feature = "simd_support",
any(target_arch = "x86", target_arch = "x86_64")
))]
simd_impl!((__m64, u8x8), (__m128i, u8x16), (__m256i, u8x32),);
simd_impl!((__m128i, u8x16), (__m256i, u8x32),);

#[cfg(test)]
mod tests {
Expand Down
5 changes: 1 addition & 4 deletions src/distributions/utils.rs
Expand Up @@ -159,9 +159,8 @@ mod simd_wmul {
}

wmul_impl! { (u16x2, u32x2),, 16 }
#[cfg(not(target_feature = "sse2"))]
wmul_impl! { (u16x4, u32x4),, 16 }
#[cfg(not(target_feature = "sse4.2"))]
#[cfg(not(target_feature = "sse2"))]
wmul_impl! { (u16x8, u32x8),, 16 }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC this is a 256-bit op and thus requires AVX?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think packed_simd provides a fallback implementation?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it?

Can you at least clarify on what basis you decided to make this change?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

without 128-bit XMM multiplications, packed_simd falls back to a scalar implementation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case, why do we bother with these cfg selectors at all?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is to provide a specialized implementation that uses less instructions.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you're saying the use of a SSE4.2 target originally was a mistake?

Maybe it was, since the insrtuctions appear to be supported on SSE2: https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=_mm_mullo_epi16&expand=3967,3990

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I appears to be a mistake. I guess that was my bad

#[cfg(not(target_feature = "avx2"))]
wmul_impl! { (u16x16, u32x16),, 16 }
Expand All @@ -187,8 +186,6 @@ mod simd_wmul {
}

#[cfg(target_feature = "sse2")]
wmul_impl_16! { u16x4, __m64, _mm_mulhi_pu16, _mm_mullo_pi16 }
#[cfg(target_feature = "sse4.2")]
wmul_impl_16! { u16x8, __m128i, _mm_mulhi_epu16, _mm_mullo_epi16 }
#[cfg(target_feature = "avx2")]
wmul_impl_16! { u16x16, __m256i, _mm256_mulhi_epu16, _mm256_mullo_epi16 }
Expand Down