Lei Mao's Log Book
Tag: Accelerated Computing

CuTe Arithmetic Tuple Tensor

 10-20-2025 10-20-2025 blog 16 minutes read (About 2388 words)
The Tensor Coordinate Generator In CuTe

 
Mathematics, Accelerated Computing, CUDA, CUTLASS, CuTe

CuTe Tiled Copy

 10-16-2025 10-16-2025 blog 28 minutes read (About 4216 words)
Understanding CuTe Tiled Copy

 
Mathematics, Accelerated Computing, CUDA, CUTLASS, CuTe

CuTe Thread-Value Layout

 10-13-2025 10-13-2025 blog 6 minutes read (About 957 words)
CuTe TV Layout, Inverse TV Layout, and TV Partition

 
Accelerated Computing, CUDA, CUTLASS, CuTe

CuTe ldmatrix

 10-03-2025 10-03-2025 blog 22 minutes read (About 3357 words)
CUDA PTX ldmatrix Instruction and Its CuTe Wrapper

 
Mathematics, Accelerated Computing, CUDA, CUTLASS, CuTe

CuTe Tilers

 09-15-2025 09-15-2025 blog 10 minutes read (About 1524 words)
Designing Tilers for Data Access

 
Mathematics, Accelerated Computing, CUDA, CUTLASS, CuTe

CuTe Inverse Layout

 08-13-2025 08-13-2025 blog 9 minutes read (About 1390 words)
Deriving Inverse Layout Mathematically

 
Mathematics, Accelerated Computing, CUDA, CUTLASS, CuTe

CuTe Blocked and Raked Products

 08-07-2025 08-07-2025 blog 9 minutes read (About 1283 words)
Creating Tiled Layouts Using Blocked Product and Raked Product

 
Mathematics, Accelerated Computing, CUDA, CUTLASS, CuTe

CuTe Local Tile

 08-01-2025 08-01-2025 blog 6 minutes read (About 865 words)
Elucidating CuTe Inner Partition and Local Tile

 
Mathematics, Accelerated Computing, CUDA, CUTLASS, CuTe

CuTe Local Partition

 07-25-2025 08-01-2025 blog 15 minutes read (About 2291 words)
Elucidating CuTe Outer Partition and Local Partition

 
Mathematics, Accelerated Computing, CUDA, CUTLASS, CuTe

CuTe Index To Coordinate

 07-19-2025 07-19-2025 blog 14 minutes read (About 2040 words)
Inverse Layout Function

 
Mathematics, Accelerated Computing, CUDA, CUTLASS, CuTe

Online Safe Softmax

 06-23-2025 06-23-2025 blog 5 minutes read (About 741 words)
Safe and Efficient Online Softmax Calculation

 
Deep Learning, Mathematics, Accelerated Computing

Roofline Performance Model

 03-26-2025 03-26-2025 blog 7 minutes read (About 1078 words)
Understand the Performance Limitations and Gaps

 
Accelerated Computing, High Performance Computing, Computer Architecture, Performance
© 2017-2025 Lei Mao  Powered by Hexo & Icarus