This page provides very high level technical information about microstructure discretization, 2-point statistics, MKS Localization and MKS Homogenization. For more detailed information, see the MINED research group's website.

The first step in all of MKS workflows is to discretize microstructures. To do this, we create a probabilistic description of a microstructure by introducing the continuous local state variable $h$, the local state space $H$ and the microstructure function $m(h, x)$. The local state space $H$ can be thought of as all of the thermodynamic state variables that are needed to uniquely define the material structure at a given location. The local state variable $h$ is one instance of the local state space, or one configuration of state variables. The microstructure function $m(h, x)$ is a probability density function of finding a a local state $h$ at location $x$. For instance, let $\mu(x)$ be a microstructure that we plan to discretize, then $\mu$ is the expectation of the microstructure function.

$$ \mu(x) = \int_H h m(h, x) dh $$Now, we will discretize the microstructure in space by averaging over small cubic domains in the microstructure function. The local state can be discretized, using two methods: one is to bin the microstructure, using the primitive basis $\Lambda$

$$ \frac{1}{\Delta x} \int_{H} \int_{s} \Lambda(h - l) m(h, x) dx dh = m[l, s] $$the other is to use a spectral representation using some orthogonal basis function $\xi$

$$ \frac{1}{\Delta x} \int_{s} m(h, x) dx dt = \sum_{l=0}^{L-1} m[l, s] \xi_l (h) $$In the notation above, all of the round brackets variables are continuous variables and the square brackets variables are the discrete variables. The variables $s$ and $S$ represent a discrete position and the total volume, while $l$ and $L$ represent the discrete versions of $h$ and $H$. In PyMKS the Legendre polynomials are currently the only orthgonal basis functions available.

Either of these two discretization methods are used to discretize the microstructure.

N-point spatial correlations provide a way to rigorously quantify material structure, using statistics. As an introduction to n-point spatial correlations, let's first discuss 1-point statistics. 1-point statistics are the probability that a specified local state will be found in any randomly selected spatial bin in a microstructure [1][2][3]. 1-point statistics compute the volume fractions of the local states in the microstructure. 1-point statistics are computed as

$$ f[l] = \frac{1}{S} \sum_s m[s,l] $$In this equation, $f[l]$ is the probability of finding the local state $l$ in any randomly selected spatial bin in the microstructure, $m[s, l]$ is the microstructure function (the digital representation of the microstructure), $S$ is the total number of spatial bins in the microstructure and $s$ refers to a specific spatial bin.

While 1-point statistics provide information on the relative amounts of the different local states, they do not provide any information about how those local states are spatially arranged in the microstructure. Therefore, 1-point statistics are a limited set of metrics to describe the structure of materials.

2-point spatial correlations (also known as 2-point statistics) contain information about the fractions of local states as well as the first order information on how the different local states are distributed in the microstructure.

2-point statistics can be thought of as the probability of having a vector placed randomly in the microstructure and having one end of the vector be on one specified local state and the other end on another specified local state. This vector could have any length or orientation that the discrete microstructure allows. The equation for 2-point statistics can be found below.

$$ f[r \vert l, l'] = \frac{1}{S} \sum_s m[s, l] m[s + r, l'] $$In this equation $f[r \vert l, l']$ is the conditional probability of finding the local states $l$ and $l'$ at a distance and orientation away from each other defined by the vector $r$. All other variables are the same as those in the 1-point statistics equation. In the case that we have an eigen microstructure function (it only contains values of 0 or 1) and we are using an indicator basis, the the $r=0$ vector will recover the 1-point statistics.

When the 2 local states are the same $l = l'$, it is referred to as a *autocorrelation*. If the 2 local states are not the same, it is referred to as a *cross-correlation*.

Higher order spatial statistics are similar to 2-point statistics, in that they can be thought of in terms of conditional probabilities of finding specified local states separated by a prescribed set of vectors. 3-point statistics are the probability of finding three specified local states at the ends of a triangle (defined by 2 vectors), placed randomly in the material structure. 4-point statistics describes the probability of finding 4 local states at 4 locations (defined, using 3 vectors) and so on.

While higher order statistics are a better metric to quantify the material structure, the 2-point statistics can be computed much faster than higher order spatial statistics, and still provide information about how the local states are distributed. For this reason, currently only 2-point statistics are implemented in PyMKS.

*Homogenization* can be used to determine effective or homogenized properties a material, and provides a way to multiscale from the bottom up. Below is some technical information on the MKS Homogenization workflow.

The first step in MKS homogenization is to compute the 2-point statistics for each of the microstructures in the calibration dataset. For more information about 2-point statistics, see the section above.

Once we have computed the 2-point statistics for every microstructure in our calibration dataset, we need to determine which of these microstructure features are most important and need to be pass to the higher length scale. We can do this, using dimensionality reduction techniques from machine learning.

PCA is the most common dimensionality reduction technique used in the MKS Homogenization workflow. PCA provides an efficient way to find a small number of microstructure descriptors that capture most of the variance in microstructures.

Once we have the low dimensional microstructure descriptors, we can use regression methods to map an effective property into the low dimensional space.

Multivariate Polynomial Regression has been the most common regression technique, used to connect the low dimensional microstructure descriptors to an effective property.

An effective property for a new microstructure is predicted by translating into the microstructure into the low dimensional space and using the regression technique.

*Localization* can be used to determine how an applied boundary condition is locally distributed within a microstructure, and provides method to multiscale from the top down. Below is some technical information on the MKS Localization workflow.

Once the state space is discretized, the relationship between the response field $p$ and microstructure function $m$ can be written as

$$ p\left[s\right] = \sum_{r=0}^{S-1} \sum_{l=0}^{L-1} \alpha\left[l, r\right] m \left[l, s - r\right] + ...$$where $\alpha$ are known as the influence coefficients and describe the relationship between $p$ and $m$. The localization requires periodic boundary conditions.

The first-order influence coefficients in the MKS Localization equation can be efficiently found, using DFTs and linear regression. The time complexity of the convolution in real space is $O(LMN^2)$, while the same results found, using FFTs, is $O(LMNlogN)$, where $L$ is the number of local states and $M$ is the number of samples.

The convolution

$$\sum_{r=0}^{S-1} \sum_{l=0}^{L-1} \alpha[l, r] m[l, s - r] $$

can be deconvolved in Fourier space, using the circular convolution theorem.

If we write $P \left[k \right] = \mathcal{F}_k \{ p\left[s\right] \}$, $M\left[l, k\right]= \mathcal{F}_k \{ m\left[l, s\right] \}$ and $\beta\left[l, k\right] = \mathcal{F}_k \{ \alpha \left[l, s\right] \}$, then we just need to solve

$$ P\left[k\right] = \sum_{l=0}^{L-1} \beta \left[l, k\right] M \left[l, k\right] $$

with a linear regression at each discretization location in $k$ to find the $\beta$ terms.

More information about MKS workflows can be found on MINED research group's webite

```
In [ ]:
```