Lei Mao's Log Book

Grouped Query Attention Performance Theoretical Analysis

 02-03-2025 03-02-2025 blog 11 minutes read (About 1612 words)
Sharing Key and Value Tensors for a Group of Query Tensors to Mitigate Transformer Attention Layer Performance Bottleneck

 
Deep Learning, Neural Network, Transformer, Computer Architecture, Performance Optimization, Large Language Model

Fix NVIDIA Driver After Ubuntu Unattended Upgrade

 01-30-2025 01-30-2025 blog 2 minutes read (About 303 words)
A Quick and Safe Log for Fixing NVIDIA Driver

 
NVIDIA, Ubuntu, Driver

Transformer Vanilla Attention Performance Theoretical Analysis

 01-27-2025 03-02-2025 blog 9 minutes read (About 1275 words)
Performance Bottleneck for Serving Transformer Models

 
Deep Learning, Neural Network, Transformer, Computer Architecture, Performance Optimization, Large Language Model

iPad Battery Health

 01-22-2025 01-22-2025 blog 3 minutes read (About 498 words)
Check iPad Battery Remaining Capacity

 
Apple, iPad

CS2 Mouse Fix

 01-14-2025 01-14-2025 blog 4 minutes read (About 599 words)
Making the Mouse Work Again in Counter-Strike 2

 
Game, Counter-Strike, CS, CS2, Mouse, Monitor, Steam

CuTe Tiled MMA

 01-09-2025 10-19-2025 blog 30 minutes read (About 4482 words)
Understanding CuTe Tiled MMA Using an Example

 
Accelerated Computing, CUDA, CUTLASS, CuTe

NVIDIA GPU Compute Capability

 01-02-2025 03-21-2025 blog 15 minutes read (About 2230 words)
A Table of NVIDIA GPUs and Their Compute Capabilities

 
CUDA, NVIDIA, GPU

AWQ: Activation-Aware Weight Quantization

 01-01-2025 01-01-2025 blog 18 minutes read (About 2738 words)
Same Performance as Group-Wise Weight-Only Quantization But with Better Accuracy

 
Deep Learning, Mathematics, Quantization, Accelerated Computing, CUDA

NeurIPS 2024 Area Chair Experience

 12-26-2024 12-26-2024 blog 9 minutes read (About 1389 words)
First Time Serving as NeurIPS Area Chair

 
Deep Learning, NeurIPS, Conference

C++ Compile-Time Type Map

 12-22-2024 12-22-2025 blog 6 minutes read (About 921 words)
C++ Select Types Based On Template Types

 
CPP, CPP17, Metaprogramming

Ubuntu 24.04 LTS GUI File Operation Slowness

 12-14-2024 12-14-2024 blog a minute read (About 217 words)
Ubuntu 24.04 LTS GUI Severe Issue Workaround

 
Ubuntu

cuBLAS GEMM API Usages for Column-Major and Row-Major Matrices

 12-12-2024 12-12-2024 blog 7 minutes read (About 1012 words)
Calling cuBLAS GEMM API Correctly

 
Accelerated Computing, CUDA, cuBLAS
© 2017-2025 Lei Mao  Powered by Hexo & Icarus