How Fast Is MLX? A Comprehensive Benchmark on 8 Apple Silicon Chips and 4 CUDA GPUs | by Tristan Bilot | Feb, 2024

A benchmark of the main operations and layers on MLX, PyTorch MPS and CUDA GPUs.

Image by author: Example of a benchmark on the softmax operation

In less than two months since its first release, MLX, the latest creation of Apple’s ML research team, has already made significant strides in the ML community. It’s remarkable to see how quickly the new framework has garnered attention, as evidenced by over 12k stars on GitHub and a growing community of over 500 members on Hugging Face 🤗.

In a previous article, we demonstrated how MLX performs in training a simple Graph Convolutional Network (GCN), benchmarking it against various devices including CPU, PyTorch’s MPS, and CUDA GPUs. The results were enlightening and showed the potential of MLX for running models efficiently.

In this exploration, we delve deeper, setting out to benchmark several key operations commonly used in neural networks.

In our benchmark, each operation is evaluated across a variety of experiments that vary in input shape and size. We ran these sequentially and multiple times across different processes to ensure stable and reliable runtime measures.
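To make the measurement idea concrete, here is a minimal sketch of a timing harness in that spirit. It is not the benchmark's actual code: the names `softmax`, `time_op`, and `run_benchmark` are hypothetical, the operation is a pure-Python stand-in rather than an MLX or PyTorch kernel, and all repeats happen in a single process rather than across several. It only illustrates warm-up runs, repeated timing, and reporting a median per input size.

```python
import math
import statistics
import time


def softmax(row):
    """Numerically stable softmax over a list of floats (pure-Python stand-in)."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def time_op(fn, arg, warmup=2, repeats=5):
    """Return the per-call runtimes in milliseconds over `repeats` timed runs."""
    for _ in range(warmup):
        fn(arg)  # warm-up runs are not timed
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(arg)
        # On a GPU backend, a synchronization point would be needed here
        # (e.g. mx.eval(...) in MLX, torch.mps.synchronize() or
        # torch.cuda.synchronize() in PyTorch) before stopping the clock.
        timings.append((time.perf_counter() - start) * 1e3)
    return timings


def run_benchmark(sizes=(64, 1024, 16384)):
    """Time softmax for several input sizes; return {size: median_ms}."""
    results = {}
    for n in sizes:
        data = [float(i % 7) for i in range(n)]
        results[n] = statistics.median(time_op(softmax, data))
    return results


if __name__ == "__main__":
    for n, ms in run_benchmark().items():
        print(f"softmax, size={n}: {ms:.4f} ms (median)")
```

The median is used here because it is less sensitive to an occasional slow run than the mean; this is a choice made for the sketch, not one stated by the benchmark.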

In the spirit of open collaboration, we’ve made the benchmark code open-source and easy to run. This allows contributors to easily add their own benchmarks based on their device and config.

Note: many thanks to all contributors, without whom this benchmark wouldn’t include as many baseline chips.

We successfully ran this benchmark across 8 different Apple Silicon chips and 4 high-performance CUDA GPUs:

Apple Silicon: M1, M1 Pro, M2, M2 Pro, M2 Max, M2 Ultra, M3 Pro, M3 Max

CUDA GPU: RTX4090 16GB (Laptop), Tesla V100 32GB (NVLink), Tesla V100 32GB (PCIe), A100 80GB (PCIe).
