compiler-builtins optimization #113

Closed
opened 2020-10-13 21:50:36 +08:00 by sb10q · 3 comments
* Currently disabled because the cargo-xbuild patch hack causes a lot of problems with Rust upgrades.
* Re-enable after a clean solution based on https://github.com/rust-lang/compiler-builtins/pull/385 is available.
Poster
Owner

https://nixbld.m-labs.hk/build/95406/log
https://nixbld.m-labs.hk/build/94848/log

Without optimization:

test_async_throughput (artiq.test.coredevice.test_performance.TransferTest) ... Async throughput:  47.34MiB/s

| Test                 | Mean (MiB/s) |  std (MiB/s) |
| -------------------- | ------------ | ------------ |
| I32 Array (1MB) H2D  |        49.68 |         0.77 |
| I32 Array (1MB) D2H  |        28.29 |         4.90 |
| I32 Array (1KB) H2D  |         5.55 |         1.02 |
| I32 Array (1KB) D2H  |         5.53 |         0.54 |
| Bytes List (1MB) H2D |        41.08 |         2.76 |
| Bytes List (1MB) D2H |        28.67 |         2.96 |
| Bytes List (1KB) H2D |         5.28 |         0.97 |
| Bytes List (1KB) D2H |         5.29 |         0.31 |
| Bytes (1MB) H2D      |        60.12 |         0.71 |
| Bytes (1MB) D2H      |        36.06 |         0.25 |
| Bytes (1KB) H2D      |         5.45 |         0.98 |
| Bytes (1KB) D2H      |         5.45 |         0.35 |
| I32 List (1MB) H2D   |        41.39 |         1.01 |
| I32 List (1MB) D2H   |        26.52 |         4.48 |
| I32 List (1KB) H2D   |         5.12 |         0.87 |
| I32 List (1KB) D2H   |         5.45 |         0.26 |

With optimization:

test_async_throughput (artiq.test.coredevice.test_performance.TransferTest) ... Async throughput:  67.10MiB/s

| Test                 | Mean (MiB/s) |  std (MiB/s) |
| -------------------- | ------------ | ------------ |
| I32 Array (1MB) H2D  |        64.50 |         1.15 |
| I32 Array (1MB) D2H  |        44.16 |         0.59 |
| I32 Array (1KB) H2D  |         6.01 |         1.02 |
| I32 Array (1KB) D2H  |         5.52 |         0.73 |
| Bytes List (1MB) H2D |        53.55 |         0.85 |
| Bytes List (1MB) D2H |        46.10 |         1.78 |
| Bytes List (1KB) H2D |         5.35 |         1.17 |
| Bytes List (1KB) D2H |         6.00 |         1.07 |
| Bytes (1MB) H2D      |        82.64 |         5.04 |
| Bytes (1MB) D2H      |        60.00 |         1.01 |
| Bytes (1KB) H2D      |         5.53 |         1.12 |
| Bytes (1KB) D2H      |         5.54 |         0.65 |
| I32 List (1MB) H2D   |        51.13 |         3.13 |
| I32 List (1MB) D2H   |        39.40 |         1.11 |
| I32 List (1KB) H2D   |         5.85 |         1.26 |
| I32 List (1KB) D2H   |         6.33 |         0.82 |
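
For a quick read of the two tables, the per-test speedup can be computed from the means. A minimal sketch in Python, with the 1MB means copied from the tables above; the `speedups` helper is a hypothetical name used for illustration and is not part of ARTIQ:

```python
# Mean throughput in MiB/s for the 1MB transfers, copied from the tables above.
without_opt = {
    "I32 Array (1MB) H2D": 49.68,
    "I32 Array (1MB) D2H": 28.29,
    "Bytes List (1MB) H2D": 41.08,
    "Bytes List (1MB) D2H": 28.67,
    "Bytes (1MB) H2D": 60.12,
    "Bytes (1MB) D2H": 36.06,
    "I32 List (1MB) H2D": 41.39,
    "I32 List (1MB) D2H": 26.52,
}
with_opt = {
    "I32 Array (1MB) H2D": 64.50,
    "I32 Array (1MB) D2H": 44.16,
    "Bytes List (1MB) H2D": 53.55,
    "Bytes List (1MB) D2H": 46.10,
    "Bytes (1MB) H2D": 82.64,
    "Bytes (1MB) D2H": 60.00,
    "I32 List (1MB) H2D": 51.13,
    "I32 List (1MB) D2H": 39.40,
}


def speedups(before, after):
    """Return after/before for every test name present in `before`."""
    return {name: after[name] / before[name] for name in before}


ratios = speedups(without_opt, with_opt)
for name, ratio in ratios.items():
    # Ratios above 1.0 mean the optimized build is faster.
    print(f"{name:<22} {ratio:.2f}x")
```

The 1KB rows are left out here because their means differ by amounts comparable to the reported standard deviations.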

The compiler-builtins crate was updated recently; maybe we can update the dependencies and rerun the benchmark.

I'm not sure how to update channel-rust-nightly.toml, so I cannot update the dependency.

Poster
Owner

wget https://static.rust-lang.org/dist/channel-rust-nightly.toml

sb10q closed this issue 2021-01-15 18:00:47 +08:00
Reference: M-Labs/artiq-zynq#113