Lei Mao's Log Book

Benchmarking NVIDIA Tensor Core MMA Instruction Peak Performances

 11-26-2025 11-26-2025 blog 11 minutes read (About 1646 words)
Reproducing NVIDIA Advertised GPU AI Peak Performances Using CUTLASS and CuTe

 
CPP, CUDA, NVIDIA, CUTLASS, CuTe, MMA, GEMM, Tensor Core

Fix Bluetooth Not Found on Ubuntu Desktop

 11-23-2025 11-23-2025 blog a minute read (About 202 words)
Troubleshooting Bluetooth Not Found Issues

 
Ubuntu, Bluetooth, Personal Computer

Focus Breathing and Compensation

 11-21-2025 11-21-2025 blog 5 minutes read (About 776 words)
Physics and Mathematics Behind Focus Breathing and Compensation

 
Physics, Camera, Photography, Videography

Core Dump and GDB

 11-15-2025 11-15-2025 blog 7 minutes read (About 1029 words)
Analyzing Core Dump Files Using GDB

 
CPP, GDB, Core Dump

Image Processing Priorities: Resolution VS Quality

 11-09-2025 11-09-2025 blog 13 minutes read (About 2013 words)
Finding The Sweet Spots Between Image Resolution, JPEG Quality, and File Size

 
Photography, JPEG, Image Processing

Nsight Streamer

 11-04-2025 11-04-2025 blog 3 minutes read (About 515 words)
Nsight Systems and Nsight Compute GUIs In a Web Browser

 
CUDA, NVIDIA, Nsight Compute, Nsight Systems, Nsight Streamer

One-Pass Naive Algorithm for Computing Variance

 10-29-2025 10-29-2025 blog 6 minutes read (About 899 words)
Caveats and Tricks

 
Statistics, Numerical Stability, Algorithm

CuTe Arithmetic Tuple Tensor

 10-20-2025 10-20-2025 blog 16 minutes read (About 2388 words)
The Tensor Coordinate Generator In CuTe

 
Mathematics, Accelerated Computing, CUDA, CUTLASS, CuTe

CuTe Tiled Copy

 10-16-2025 10-16-2025 blog 28 minutes read (About 4216 words)
Understanding CuTe Tiled Copy

 
Mathematics, Accelerated Computing, CUDA, CUTLASS, CuTe

CuTe Thread-Value Layout

 10-13-2025 10-13-2025 blog 6 minutes read (About 955 words)
CuTe TV Layout, Inverse TV Layout, and TV Partition

 
Accelerated Computing, CUDA, CUTLASS, CuTe

Setting Up Environment Variables In SSH Sessions Over TCP On Runpod

 10-10-2025 10-10-2025 blog 12 minutes read (About 1785 words)
Fixing an Environment Variables Issue for Runpod

 
CUDA, NVIDIA, Docker, GPU, Cloud Computing, Runpod, IDE, SSH

Setting Up Remote Development Using Custom Template On Runpod

 10-08-2025 10-13-2025 blog 12 minutes read (About 1814 words)
Custom Remote Development Using GPUs on Runpod

 
CUDA, NVIDIA, Docker, GPU, Cloud Computing, Runpod, IDE, SSH
Lei Mao

Artificial Intelligence Machine Learning Computer Science

Menlo Park, California

© 2017-2026 Lei Mao  Powered by Hexo & Icarus