Releases: microsoft/mscclpp

MSCCL++ v0.5.1

26 May 21:32
cddffbc

Full Changelog: v0.5.0...v0.5.1

MSCCL++ v0.5.0

04 May 23:53
9c2a960

Full Changelog: v0.4.3...v0.5.0

MSCCL++ v0.4.3

27 Mar 18:55
1a7cb98

Full Changelog: v0.4.2...v0.4.3

MSCCL++ v0.4.2

20 Dec 12:25
f1605b7

Full Changelog: v0.4.1...v0.4.2

MSCCL++ v0.4.1

06 Dec 02:14
c15a166

Full Changelog: v0.4.0...v0.4.1

MSCCL++ v0.4.0

24 Nov 09:09
351b95b
  • Add Python benchmark
  • Update documentation
  • Add ROCm support
  • Bug fixes

See #160 for details.

MSCCL++ v0.3.0

11 Oct 14:37
8c0f9e8
  • Update interfaces
  • Add Python bindings and interfaces
  • Add Python unit tests
  • Add more configurable parameters
  • Add a new single-node AllReduce kernel
  • Fix bugs

See #89 for details.

Full Changelog: v0.2.0...v0.3.0

MSCCL++ v0.2.0

11 Jul 03:03
2e16457

Communication Features and Interfaces

GPU-side communication interfaces (DeviceChannel)

    • Proxy-based Interfaces: ProxyChannel (#66)
    • In-SM Copy Interfaces: SmChannel (#55)
    • Packet Copy Interfaces: putPackets(), getPackets(), signalPacket() (#85, #90, #102)

Host-side interfaces

    • Bootstrap: fix socket performance issue & bugs (#92, #100, #113)
    • Communicator: implement (#66)

Transport support

    • NVLink: implement (#66)
    • InfiniBand: implement (#66)
    • InfiniBand: tackle memory consistency issues (#96)

Performance Optimization

    • Throughput: pass AllGather perf qualification (#77)
    • Throughput: pass AllReduce perf qualification (#83, #90)
    • Throughput: pass AllToAll perf qualification (#87)
    • Latency: pass AllReduce perf qualification (#85, #90)
    • Latency: pass 2-node AllReduce perf qualification (#109, #118)

Development Pipeline

    • Unit Tests: cover all interfaces (#81, #91)
    • mscclpp-test: add AllGather (#77)
    • mscclpp-test: add AllReduce (#83)
    • mscclpp-test: add AllToAll (#87)
    • CI: lint, spelling, CodeQL (#79)
    • CI: unit test (#81)
    • Package: publish Docker images (#104)

Documents

    • Doxygen: add configuration (#72)
    • README: enhance details (#88)
    • License: add license comments on all files (#106)
    • Code: cleanup & comments (#86, #119)

Full Changelog: https://github.com/microsoft/mscclpp/commits/v0.2.0

MSCCL++ v0.1.0

27 Mar 11:18
c706990
Pre-release

Features

  • Transport setup
    • Bootstrap (initial meta-data exchange between ranks)
    • Connection setup for P2P NVLink and InfiniBand
    • CPU proxies for P2P NVLink and InfiniBand
  • Transport interface
    • Trigger FIFO
    • put-signal-wait interface
  • Tests
    • AllToAll
    • AllGather based on AllToAll

Full Changelog: https://github.com/microsoft/mscclpp/commits/v0.1.0