PyTorch Eager Mode Quantization TensorRT Acceleration 05-24-2024 05-24-2024 blog 7 minutes read (About 1051 words) TensorRT Acceleration for PyTorch Native Eager Mode Quantization Models Deep Learning, Python, Inference, Quantization, Accelerated Computing, NVIDIA, TensorRT, PyTorch, GPU
CUDA Matrix Multiplication Optimization 01-20-2024 01-20-2024 article 2 hours read (About 19282 words) General Matrix Multiplication CUDA Performance Optimization CPP, Accelerated Computing, CUDA, NVIDIA
CUDA Tensor Layouts for Convolution 06-04-2023 06-04-2023 blog 13 minutes read (About 1960 words) Motivations for Different Tensor Layouts Accelerated Computing, CUDA
NVIDIA Tensor Core Programming 05-18-2023 12-27-2023 blog 28 minutes read (About 4243 words) Fast Matrix Multiplication and Accumulation on GPU CPP, Accelerated Computing, CUDA, NVIDIA
Moore's Law 04-10-2023 04-10-2023 blog 7 minutes read (About 1085 words) Moore's Law Is Dead. What's Next? Accelerated Computing, GPU, CPU
Transformer Autoregressive Inference Optimization 04-06-2023 04-06-2023 article 27 minutes read (About 4084 words) Principles for Faster Transformer Inference Deep Learning, Inference, Natural Language Processing, Optimization, Transformer, Accelerated Computing
Strassen Algorithm 01-13-2023 01-13-2023 blog 7 minutes read (About 1016 words) Asymptotically Faster Matrix Multiplication Algorithm Computer Science, Accelerated Computing, Algorithm
CSR Sparse Matrix Multiplication 12-21-2022 12-21-2022 blog 13 minutes read (About 1886 words) Accelerate Sparse Matrix Multiplication Using CSR Format Accelerated Computing
CUDA Matrix Multiplication 03-21-2022 03-04-2023 blog 32 minutes read (About 4790 words) Implement Matrix Multiplication and Batched Matrix Multiplication Using CUDA CPP, Accelerated Computing, CUDA