Lei Mao's Log Book

C++ Latch and Barrier

 02-06-2026 02-06-2026 blog 8 minutes read (About 1154 words)
Scheduling and Synchronizing Threads Using std::latch and std::barrier

 
CPP, Multithreading, Parallel Programming

CUDA Rendezvous Stream

 01-26-2026 01-26-2026 blog 11 minutes read (About 1690 words)
Simplifying Synchronization Complexities Using CUDA Rendezvous Streams

 
CUDA, NVIDIA, Parallel Computing, GPU

Randomized SVD

 01-19-2026 01-19-2026 blog 12 minutes read (About 1749 words)
Efficient Approximation of Singular Value Decomposition Using Random Projections

 
Linear Algebra, SVD, Randomized SVD

PyTorch CUDA Graph Capture

 01-12-2026 01-12-2026 blog 23 minutes read (About 3454 words)
Using PyTorch CUDA Graph APIs

 
CUDA, PyTorch, CUDA Graph, Perfetto

Disqus Affiliate Links URL Hijacking

 01-06-2026 01-06-2026 blog 3 minutes read (About 407 words)
URL Hijacking Caused By Third-Party Service

 
Disqus, Web Security

Inspecting and Visualizing Torch FX Graph

 12-31-2025 12-31-2025 blog 13 minutes read (About 1882 words)
Torch FxGraphDrawer

 
Python, PyTorch, Torch FX

NVIDIA NVML GPU Statistics

 12-25-2025 12-25-2025 blog 15 minutes read (About 2214 words)
Mimicking nvidia-smi dmon Using NVIDIA NVML

 
CPP, CUDA, NVIDIA, GPU, NVML

Radix Sort

 12-18-2025 12-18-2025 blog 19 minutes read (About 2808 words)
A Non-Comparative Sorting Algorithm

 
CPP, Python, Algorithm

Install NVIDIA RTX 5080

 12-10-2025 12-10-2025 blog 5 minutes read (About 703 words)
Installing NVIDIA RTX 5080 on an Old Desktop

 
Ubuntu, NVIDIA, GPU

NVIDIA Tensor Core TN Layout MMA Instruction

 12-06-2025 12-06-2025 blog 16 minutes read (About 2389 words)
GEMM Layout, History, Performance, and Implementation

 
CPP, CUDA, NVIDIA, CUTLASS, CuTe, MMA, GEMM, Tensor Core

Replacing Thinkpad X1 Yoga CMOS Battery

 12-02-2025 12-02-2025 blog 3 minutes read (About 377 words)
The First Time I Replaced a CMOS Battery on a Computer

 
Thinkpad, CMOS, DIY

Benchmarking NVIDIA Tensor Core MMA Instruction Peak Performances

 11-26-2025 11-26-2025 blog 11 minutes read (About 1646 words)
Reproducing NVIDIA Advertised GPU AI Peak Performances Using CUTLASS and CuTe

 
CPP, CUDA, NVIDIA, CUTLASS, CuTe, MMA, GEMM, Tensor Core
Lei Mao

Artificial Intelligence Machine Learning Computer Science

Menlo Park, California






© 2017-2026 Lei Mao  Powered by Hexo & Icarus