CuTe Index To Coordinate
Introduction
In CuTe programming, sometimes we would like to convert an index to a coordinate. This requires an inverse layout function, which is generally not straightforward to derive. However, if we could assume the layout is compact, we can derive the inverse layout function from the coordinate isomorphism.
In this article, I would like to discuss the derivation of the inverse layout function of a compact layout and its analogy to the coordinate isomorphism mathematically.
Prerequisites
To understand how to convert an index to a coordinate in the CuTe, we have to be familiar with the coordinate isomorphism and the layout function, which are two common concepts in the CuTe layout algebra. I have discussed these in my previous article “CuTe Layout Algebra” and I have copied the relevant introductions to these concepts for convenience because they can be read standalone.
Coordinate Isomorphism
Let $\mathbf{S} = (M_0, M_1, \ldots, M_{\alpha})$ and $\mathbf{D} = (d_0, d_1, \ldots, d_{\alpha})$ be the respective shape and stride tuples of $L = \mathbf{S}:\mathbf{D}$. Let $M = M_0 \cdot M_1 \cdot \ldots \cdot M_{\alpha}$ be the size of $L$ and let $[0, M) \subset \mathbb{N}$ be the subset of the natural numbers given by ${0, 1, 2, \ldots, M - 1}$. Then we have an isomorphism
$$
\begin{align}
\iota: [0, M) \cong [0, M_0) \times [0, M_1) \times \ldots \times [0, M_{\alpha}) \\
\end{align}
$$
Given any $x \in [0, M)$, the isomorphism $\iota$ maps $x$ to the tuple
$$
\begin{align}
x \mapsto \left(x \mod M_0, \left\lfloor \frac{x}{M_0} \right\rfloor \mod M_1, \ldots, \left\lfloor \frac{x}{M_0 \cdot M_1 \cdot \ldots \cdot M_{\alpha - 1}} \right\rfloor \mod M_{\alpha}\right)
\end{align}
$$
The isomorphism mapping is bijective. In our case, given any tuple $(x_0, x_1, \ldots, x_{\alpha}) \in [0, M_0) \times [0, M_1) \times \ldots \times [0, M_{\alpha})$, the isomorphism inverse maps the tuple to the integer
$$
\begin{align}
\left(x_0, x_1, \ldots, x_{\alpha}\right) \mapsto x_0 + x_1 \cdot M_0 + x_2 \cdot M_0 \cdot M_1 + \ldots + x_{\alpha} \cdot M_0 \cdot M_1 \cdot \ldots \cdot M_{\alpha - 1}
\end{align}
$$
It’s straightforward to verify the above isomorphism mapping is valid and proof the above isomorphism mapping is bijective (by contradiction).
One could imagine the isomorphism as a mapping between a one-dimensional coordinate and a multi-dimensional coordinate.
Layout Function
Given a layout $L$, its layout function is the function $f_L: [0, M) \to \mathbb{N}$ is defined to be the composite
$$
\begin{align}
[0, M) \cong [0, M_0) \times [0, M_1) \times \ldots \times [0, M_{\alpha}) \subset \mathbb{N}^{\times (\alpha + 1)} \xrightarrow{\cdot d_0, \cdot d_1, \ldots, \cdot d_{\alpha}} \mathbb{N}^{\times (\alpha + 1)} \xrightarrow{+} \mathbb{N}
\end{align}
$$
In other words, $f_L$ is the composition of the multilinear function
$$
\begin{align}
[0, M_0) \times [0, M_1) \times \ldots \times [0, M_{\alpha}) \to \mathbb{N} \\
(x_0, x_1, \ldots, x_{\alpha}) \mapsto x_0 \cdot d_0 + x_1 \cdot d_1 + \ldots + x_{\alpha} \cdot d_{\alpha}
\end{align}
$$
determined by the stride, with the isomorphism $\iota$, determined by the shape.
Computing the value of a layout function $f_L$ at a point $x \in [0, M)$ can be decomposed into computing the sum of the values of the layout function at multiple points. This is sometimes useful for computing the value of the layout function at a point handily.
Given a layout $L = (M_0, M_1, \ldots, M_{\alpha}):(d_0, d_1, \ldots, d_{\alpha})$ and $x \in [0, M)$,
$$
\begin{align}
x \mapsto \left(x_0, x_1, \ldots, x_{\alpha}\right) \mapsto x_0 \cdot d_0 + x_1 \cdot d_1 + \ldots + x_{\alpha} \cdot d_{\alpha}
\end{align}
$$
We also have
$$
\begin{align}
x^{\prime}_{0} &\mapsto \left(x_0, 0, 0, \ldots, 0\right) \mapsto x_0 \cdot d_0 \\
x^{\prime}_{1} &\mapsto \left(0, x_1, 0, \ldots, 0\right) \mapsto x_1 \cdot d_1 \\
&\vdots \\
x^{\prime}_{\alpha} &\mapsto \left(0, 0, 0, \ldots, x_{\alpha}\right) \mapsto x_{\alpha} \cdot d_{\alpha}
\end{align}
$$
Therefore, we have
$$
\begin{align}
f_L(x) = f_L(x^{\prime}_{0}) + f_L(x^{\prime}_{1}) + \ldots + f_L(x^{\prime}_{\alpha})
\end{align}
$$
where
$$
\begin{align}
x^{\prime}_{0} &= x \mod M_0 \\
x^{\prime}_{1} &= \left\lfloor \frac{x}{M_0} \right\rfloor \mod M_1 \cdot M_0 \\
&\vdots \\
x^{\prime}_{\alpha} &= \left\lfloor \frac{x}{M_0 \cdot M_1 \cdot \ldots \cdot M_{\alpha - 1}} \right\rfloor \mod M_{\alpha} \cdot M_0 \cdot M_1 \cdot \ldots \cdot M_{\alpha - 1}
\end{align}
$$
For example, given a layout $L = (3, 2):(2, 3)$ and $x = 5$, we have
$$
\begin{align}
f_L(5) &= f_L(5 \mod 3) + f_L\left(\left\lfloor \frac{5}{3} \right\rfloor \mod 2 \cdot 3\right) \\
&= f_L(2) + f_L(3) \\
&= 2 \cdot 2 + \left\lfloor \frac{3}{3} \right\rfloor \cdot 3 \\
&= 4 + 3 \\
&= 7
\end{align}
$$
Inverse Layout Function
The layout function essentially computes a dot product of the coordinate tuple with the stride tuple. The inverse of dot product, i.e., computing the coordinate tuple using the stride tuple and the value of layout function, is not straightforward. However, if we could assume the layout function is “compact” or injective, we can compute the coordinate tuple using the value of layout function and the stride tuple. In this situation, computing coordinate tuple from index is somewhat similar to computing multi-dimensional coordinate from one-dimensional coordinate in the coordinate isomorphism.
Naive Algorithm and Implementation
Let $\mathbf{S} = (M_0, M_1, \ldots, M_{\alpha})$ and $\mathbf{D} = (d_0, d_1, \ldots, d_{\alpha})$ be the respective shape and stride tuples of the layout $L = \mathbf{S}:\mathbf{D}$. Given its coordinate tuple $\mathbf{x} = (x_0, x_1, \ldots, x_{\alpha})$, we have the index mapping $f_L{(\mathbf{x})} = f_L(x_0, x_1, \ldots, x_{\alpha}) = x_0 \cdot d_0 + x_1 \cdot d_1 + \ldots + x_{\alpha} \cdot d_{\alpha}$. We could sort the layout $L$ by the stride tuple $\mathbf{D}$ so that $\mathbf{S}^{\prime} = (M_0^{\prime}, M_1^{\prime}, \ldots, M_{\alpha}^{\prime})$ and $\mathbf{D}^{\prime} = (d_0^{\prime}, d_1^{\prime}, \ldots, d_{\alpha}^{\prime})$ be the respective shape and stride tuples of the sorted $L^{\prime} = \mathbf{S}^{\prime}:\mathbf{D}^{\prime}$, where $d_0^{\prime} < d_1^{\prime} < \ldots < d_{\alpha}^{\prime}$. The permuted index has to be stored so that we could map the permuted tuples back to the original ones. The permuted coordinate tuple becomes $\mathbf{x}^{\prime} = (x_0^{\prime}, x_1^{\prime}, \ldots, x_{\alpha}^{\prime})$. The index mapping value, however, remains the same, i.e., $f_{L^{\prime}}{(\mathbf{x}^{\prime})} = f_{L^{\prime}}{(x_0^{\prime}, x_1^{\prime}, \ldots, x_{\alpha}^{\prime})} = x_0^{\prime} \cdot d_0^{\prime} + x_1^{\prime} \cdot d_1^{\prime} + \ldots + x_{\alpha}^{\prime} \cdot d_{\alpha}^{\prime} = f_L{(\mathbf{x})}$.
Assuming the layout $L$ is compact, we must have $d_0^{\prime} = 1$, $d_1 = M_0^{\prime}$, $d_2 = M_0^{\prime} \cdot M_1^{\prime}$, $\ldots$, $d_{\alpha} = M_0^{\prime} \cdot M_1^{\prime} \cdot \ldots \cdot M_{\alpha - 1}^{\prime}$. Therefore, the index mapping can be rewritten as
$$
\begin{align}
f_L{(\mathbf{x})}
&= x_0^{\prime} \cdot d_0^{\prime} + x_1^{\prime} \cdot d_1^{\prime} + x_2^{\prime} \cdot d_2^{\prime} + \ldots + x_{\alpha}^{\prime} \cdot d_{\alpha}^{\prime} \\
&= x_0^{\prime} + x_1^{\prime} \cdot M_0^{\prime} + x_2^{\prime} \cdot M_0^{\prime} \cdot M_1^{\prime} + \ldots + x_{\alpha}^{\prime} \cdot M_0^{\prime} \cdot M_1^{\prime} \cdot \ldots \cdot M_{\alpha - 1}^{\prime}
\end{align}
$$
This is exactly the same as the coordinate isomorphism we described earlier. Given an index $f_L{(\mathbf{x})}$ and the sorted stride tuple $\mathbf{D}^{\prime} = (d_0^{\prime}, d_1^{\prime}, \ldots, d_{\alpha}^{\prime})$, we can compute the coordinate tuple $\mathbf{x}^{\prime} = (x_0^{\prime}, x_1^{\prime}, \ldots, x_{\alpha}^{\prime})$ as follows:
$$
\begin{align}
\mathbf{x}^{\prime} &= (x_0^{\prime}, x_1^{\prime}, \ldots, x_{\alpha}^{\prime}) \\
&= \left(f_L{(\mathbf{x})} \mod M_0^{\prime}, \left\lfloor \frac{f_L{(\mathbf{x})}}{M_0^{\prime}} \right\rfloor \mod M_1^{\prime}, \ldots, \left\lfloor \frac{f_L{(\mathbf{x})}}{M_0^{\prime} \cdot M_1^{\prime} \cdot \ldots \cdot M_{\alpha - 1}^{\prime}} \right\rfloor \mod M_{\alpha}^{\prime}\right)
\end{align}
$$
Once we have computed the permuted coordinate tuple $\mathbf{x}^{\prime}$, because we have stored the sorted index, we can map the permuted coordinate tuple back to the original coordinate tuple $\mathbf{x} = (x_0, x_1, \ldots, x_{\alpha})$ using the sorted index.
An Improved Implementation
The inverse layout function implemented in CuTe, idx2crd
, actually requires no sorting of the layout.
1 | /** idx2crd(i,s,d) splits an index into a coordinate within <Shape,Stride>. |
We notice that the permuted coordinate tuple formula could be rewritten in a way that requires no sorting and product accumulation of the shape tuple, because the product accumulation of the shape tuple is already stored in the stride tuple. The permuted coordinate tuple can be computed as follows:
$$
\begin{align}
\mathbf{x}^{\prime}
&= (x_0^{\prime}, x_1^{\prime}, \ldots, x_{\alpha}^{\prime}) \\
&= \left(f_L{(\mathbf{x})} \mod M_0^{\prime}, \left\lfloor \frac{f_L{(\mathbf{x})}}{M_0^{\prime}} \right\rfloor \mod M_1^{\prime}, \ldots, \left\lfloor \frac{f_L{(\mathbf{x})}}{M_0^{\prime} \cdot M_1^{\prime} \cdot \ldots \cdot M_{\alpha - 1}^{\prime}} \right\rfloor \mod M_{\alpha}^{\prime}\right) \\
&= \left(\left\lfloor \frac{f_L{(\mathbf{x})}}{d_0^{\prime}} \right\rfloor \mod M_0^{\prime}, \left\lfloor \frac{f_L{(\mathbf{x})}}{d_1^{\prime}} \right\rfloor \mod M_1^{\prime}, \ldots, \left\lfloor \frac{f_L{(\mathbf{x})}}{d_{\alpha}^{\prime}} \right\rfloor \mod M_{\alpha}^{\prime}\right) \\
\end{align}
$$
Because of this, there is no need to sort the layout just in order to compute the product accumulation of the shape tuple. The original coordinate tuple can be computed as follows:
$$
\begin{align}
\mathbf{x} &= (x_0, x_1, \ldots, x_{\alpha}) \\
&= \left(\left\lfloor \frac{f_L{(\mathbf{x})}}{d_0} \right\rfloor \mod M_0, \left\lfloor \frac{f_L{(\mathbf{x})}}{d_1} \right\rfloor \mod M_1, \ldots, \left\lfloor \frac{f_L{(\mathbf{x})}}{d_{\alpha}} \right\rfloor \mod M_{\alpha}\right) \\
\end{align}
$$
Conclusions
The inverse layout function of a compact layout can be derived from the coordinate isomorphism.
References
CuTe Index To Coordinate