Modal decompositions – POD, BPOD, and DMD¶

This tutorial discusses computing modes from data, using the Proper Orthogonal Decomposition (POD), Balanced Proper Orthogonal Decomposition (BPOD), and Dynamic Mode Decomposition (DMD). For details of these algorithms, see [HLBR] for POD and BPOD and [TRLBK] for DMD.

The first step is to collect your data. We call each piece of data a “vector” or “vector object”. By vector, we don’t mean a 1D array. Rather, we mean an element of a vector space. This could be a 1D, 2D, or 3D array, or any other object that satisfies the properties of a vector space. The examples below build on one another, each introducing new aspects.

Example 1 – All data in an array¶

A simple way to find POD modes is:

import numpy as np

import modred as mr


# Create random data
num_vecs = 30
vecs = np.random.random((100, num_vecs))

# Compute POD
num_modes = 5
POD_res = mr.compute_POD_arrays_snaps_method(
    vecs, list(mr.range(num_modes)))
modes = POD_res.modes
eigvals = POD_res.eigvals

Let’s walk through the important steps. First, we create an array of random data. Each column is a vector represented as a 1D array. Then we call the function compute_POD_arrays_snaps_method, which returns the first num_modes modes as columns of the array modes, and all of the non-zero eigenvalues, sorted from largest to smallest.

This function implements the “method of snapshots”, as described in Section 3.4 of [HLBR]. In short, it computes the correlation array $X^* X$ , where $X$ is vecs, then finds its eigenvectors and eigenvalues, which are related to the modes.

Example 2 – Inner products with all data in an array¶

You can use a weighted inner product, specified here by a 1D array of weights, so that the correlation array is $X^* W X$ , where $X$ is vecs and $W$ contains the inner product weights. The weights also can, more generally, be an array. The vectors are again represented as columns of an array.

import numpy as np

import modred as mr


# Create random data
num_vecs = 100
nx = 100
vecs = np.random.random((nx, num_vecs))

# Define non-uniform grid and corresponding inner product weights
x_grid = 1. - np.cos(np.linspace(0, np.pi, nx))
x_diff = np.diff(x_grid)
weights = 0.5 * np.append(np.append(
    x_diff[0], x_diff[:-1] + x_diff[1:]), x_diff[-1])

# Compute POD
num_modes = 10
POD_res = mr.compute_POD_arrays_direct_method(
    vecs, list(mr.range(num_modes)), inner_product_weights=weights)
modes = POD_res.modes
eigvals = POD_res.eigvals

This function computes the singular value decomposition (SVD) of $W^{1/2} X$ , and we refer to this as the “direct method” to distinguish it from the method of snapshots in the previous example. The differences between the two are insignificant in most cases. For details, see pod.compute_POD_arrays_direct_method().

Example 3 – Vector handles for loading and saving¶

This example demonstrates vector handles, which are very important because they allow modred to interact with large and complicated datasets without requiring that all of the vectors be stacked into a single array. This is necessary, for example, if the data is too large to all fit in memory simultaneously.

The following example computes direct and adjoint modes using Balanced POD (see Chapter 5 of [HLBR]):

import os

import numpy as np

import modred as mr


# Create directory for output files
out_dir = 'tutorial_ex3_out'
if not os.path.isdir(out_dir):
    os.makedirs(out_dir)

# Define handles for snapshots
num_vecs = 30
direct_snapshots = [
    mr.VecHandleArrayText('%s/direct_vec%d.txt' % (out_dir, i))
    for i in mr.range(num_vecs)]
adjoint_snapshots = [
    mr.VecHandleArrayText('%s/adjoint_vec%d.txt' % (out_dir, i))
    for i in mr.range(num_vecs)]

# Save random snapshot data in text files
x = np.linspace(0, np.pi, 200)
for i, snap in enumerate(direct_snapshots):
    snap.put(np.sin(x * i))
for i, snap in enumerate(adjoint_snapshots):
    snap.put(np.cos(0.5 * x * i))

# Calculate and save BPOD modes
my_BPOD = mr.BPODHandles(inner_product=np.vdot, max_vecs_per_node=10)
sing_vals, L_sing_vecs, R_sing_vecs = my_BPOD.compute_decomp(
    direct_snapshots, adjoint_snapshots)
num_modes = 10
mode_nums = list(mr.range(num_modes))
direct_modes = [
    mr.VecHandleArrayText('%s/direct_mode%d' % (out_dir, i))
    for i in mode_nums]
adjoint_modes = [
    mr.VecHandleArrayText('%s/adjoint_mode%d' % (out_dir, i))
    for i in mode_nums]
my_BPOD.compute_direct_modes(mode_nums, direct_modes)
my_BPOD.compute_adjoint_modes(mode_nums, adjoint_modes)

First, we create lists direct_snapshots and adjoint_snapshots. (“Snapshots” is just a common word to describe vectors that come from time-sampling a system as it evolves.) Each element in these lists is a vector handle (in particular, an instance of VecHandleArrayText). All vector handles have member functions to load, vec = vec_handle.get(), and save, vec_handle.put(vec), but themselves use very little memory because they do not internally contain a vector. modred uses these vector handles to load and save individual vectors only as it needs them.

All the snapshots are vectors and each is represented as an array, but they are not stacked into a single 2D array. Ordinarily the snapshots would already exist, for instance as files from a simulation or experiment. Here, we artificially generate snapshots and write them to file using the put() method.

Next, we compute the BPOD modes. The BPODHandles constructor takes an optional argument max_vecs_per_node=10, specifying that only 10 vectors (snapshots + modes) can be in one node’s memory at one time. The function compute_decomp takes lists of vector handles as arguments. In this example, note that there are 30 direct and 30 adjoint snapshots, so handles are necessary to avoid violating max_vecs_per_node=10.

Similarly, compute_direct_modes and compute_adjoint_modes take lists of handles. The modes are saved via handle.put() rather than returned, which might require too much memory.

Replacing VecHandleArrayText with VecHandlePickle would load/save all vectors (snapshots and/or modes) to Python’s binary pickle files. Note that pickling works with any type of vector, including user-defined ones, whereas saving to text is only written for 1D and 2D arrays.

To run this example in parallel is easy. The only complication is the data must be saved by only one processor, and moving these lines inside an if block solves this:

parallel = mr.parallel_default_instance
if parallel.is_rank_zero():
    # Loops that call handles.put
    pass
parallel.barrier()

After this change, the code will still work in serial, even if mpi4py is not installed. To run this, where the above script is saved as main_bpod.py, execute:

mpiexec -n 8 python main_bpod.py

It is rare to need to handle parallelization yourself, but if you do, you should use the provided mr.parallel_default_instance instance as above. Also provided are member functions parallel.get_rank() and parallel.get_num_procs() (see parallel for details).

If you’re curious, the text files are saved in a format defined in util.load_array_text() and util.save_array_text().

Example 4 – Inner product function¶

You are free to use any inner product function, as long as it has the interface value = inner_product(vec1, vec2) and satisfies the mathematical definition of an inner product (see Interfacing with your data). This example uses the trapezoidal rule for inner products on an arbitrary n-dimensional cartesian grid (see vectors.InnerProductTrapz). The object weighted_IP is callable (it has a special method __call__) so it acts as the inner product the usual way: value = weighted_IP(vec1, vec2).

import os

import numpy as np

from modred import parallel
import modred as mr


# Create directory for output files
out_dir = 'tutorial_ex4_out'
if not os.path.isdir(out_dir):
    os.makedirs(out_dir)

# Define non-uniform grid and corresponding inner product weights
nx = 80
ny = 100
x_grid = 1. - np.cos(np.linspace(0, np.pi, nx))
y_grid = np.linspace(0, 1., ny) ** 2
Y, X = np.meshgrid(y_grid, x_grid)

# Create random snapshot data
num_vecs = 100
snapshots = [
    mr.VecHandlePickle('%s/vec%d.pkl' % (out_dir, i))
    for i in mr.range(num_vecs)]
if parallel.is_rank_zero():
    for i, snap in enumerate(snapshots):
        snap.put(np.sin(X * i) + np.cos(Y * i))
parallel.barrier()

# Calculate DMD modes and save them to pickle files
weighted_IP = mr.InnerProductTrapz(x_grid, y_grid)
my_DMD = mr.DMDHandles(inner_product=weighted_IP)
my_DMD.compute_decomp(snapshots)
my_DMD.put_decomp(
    '%s/eigvals.txt' % out_dir, '%s/R_low_order_eigvecs.txt' % out_dir,
    '%s/L_low_order_eigvecs.txt' % out_dir,
    '%s/correlation_array_eigvals.txt' % out_dir,
    '%s/correlation_array_eigvecs.txt' % out_dir)
mode_indices = [1, 4, 5, 0, 10]
modes = [
    mr.VecHandlePickle('%s/mode%d.pkl' % (out_dir, i)) for i in mode_indices]
my_DMD.compute_exact_modes(mode_indices, modes)

Also shown in this example is the useful put_decomp, which, by default saves the arrays associated with the decomposition to text files. (See Array input and output.)

Again, this code can be executed in parallel without any modifications.

Example 5 – Shifting and scaling vectors using vector handles¶

Often vectors contain an offset (also called a shift or translation) such as a mean or equilibrium state, and one might want to do model reduction with this known offset removed. We call this offset the base vector, and it can be subtracted off by the vector handle class as shown in this example.

Note that handle.put does not use the base vector; the base vector is only subtracted by handle.get.

You might also want to scale all of your vectors by factors as you retrieve them for use in modred, and this can also be done by the handle.get function. When using both base vector shifting and scaling, the default order is first shifting then scaling: (vec - base_vec)*scale.

This examples uses quadrature weights, where each vector is weighted. It also shows how to load vectors in one format (pickle, via mr.VecHandlePickle) and save modes in another (text, via mr.VecHandleArrayText).

import os

import numpy as np

from modred import parallel
import modred as mr


# Create directory for output files
out_dir = 'tutorial_ex5_out'
if not os.path.isdir(out_dir):
    os.makedirs(out_dir)

# Create artificial sample times used as quadrature weights in POD
num_vecs = 100
quad_weights = np.logspace(1., 3., num=num_vecs)
base_vec_handle = mr.VecHandlePickle('%s/base_vec.pkl' % out_dir)
snapshots = [
    mr.VecHandlePickle(
        '%s/vec%d.pkl' % (out_dir, i),base_vec_handle=base_vec_handle,
        scale=quad_weights[i])
    for i in mr.range(num_vecs)]

# Save arbitrary snapshot data
num_elements = 2000
if parallel.is_rank_zero():
    for snap in snapshots + [base_vec_handle]:
        snap.put(np.random.random(num_elements))
parallel.barrier()

# Compute and save POD modes
my_POD = mr.PODHandles(inner_product=np.vdot)
my_POD.compute_decomp(snapshots)
my_POD.put_decomp('%s/sing_vals.txt' % out_dir, '%s/sing_vecs.txt' % out_dir)
my_POD.put_correlation_array('%s/correlation_array.txt' % out_dir)
mode_indices = [1, 4, 5, 0, 10]
modes = [
    mr.VecHandleArrayText('%s/mode%d.txt' % (out_dir, i)) for i in mode_indices]
my_POD.compute_modes(mode_indices, modes)

# Check that modes are orthonormal
vec_space = mr.VectorSpaceHandles(inner_product=np.vdot)
IP_array = vec_space.compute_symm_inner_product_array(modes)
if not np.allclose(IP_array, np.eye(len(mode_indices))):
    print('Warning: modes are not orthonormal')
    print(IP_array)

At the end of this example, we use an instance of the low-level class vectorspace.VectorSpaceHandles to check that the POD modes are orthonormal. It’s generally not necessary to use this class (or do this check), but if the need arises, it should be used (instead of writing new code) since it is tested and parallelized.

Example 6 – User-defined vectors and handles¶

So far, all of the vectors have been arrays, but you may want to apply modred to data saved in your own custom format with more complicated inner products and other operations. This is no problem at all; modred works with data in any format! That’s worth saying again: modred works with data in any format! Of course, you’ll have to tell modred how to interact with your data, but that’s pretty easy. You just need to define and use your own vector handle and vector objects.

There are two important new features of this example: a custom vector class CustomVector and a custom vector handle class CustomVecHandle. These definitions may be collected together in a file, for instance called customvector.py:

import pickle
from copy import deepcopy

import numpy as np

import modred as mr


class CustomVector(mr.Vector):
    def __init__(self, grids, data_array):
        self.grids = grids
        self.data_array = data_array
        self.weighted_ip = mr.InnerProductTrapz(*self.grids)


    def __add__(self, other):
        """Return a new object that is the sum of self and other"""
        sum_vec = deepcopy(self)
        sum_vec.data_array = self.data_array + other.data_array
        return sum_vec


    def __mul__(self, scalar):
        """Return a new object that is ``self * scalar`` """
        mult_vec = deepcopy(self)
        mult_vec.data_array = mult_vec.data_array * scalar
        return mult_vec


    def inner_product(self, other):
        return self.weighted_ip(self.data_array, other.data_array)


class CustomVecHandle(mr.VecHandle):
    def __init__(self, vec_path, base_handle=None, scale=None):
        mr.VecHandle.__init__(self, base_handle, scale)
        self.vec_path = vec_path


    def _get(self):
        file_id = open(self.vec_path, 'rb')
        grids = pickle.load(file_id)
        data_array = pickle.load(file_id)
        file_id.close()
        return CustomVector(grids, data_array)


    def _put(self, vec):
        file_id = open(self.vec_path, 'wb')
        pickle.dump(vec.grids, file_id)
        pickle.dump(vec.data_array, file_id)
        file_id.close()


def inner_product(v1, v2):
    return v1.inner_product(v2)

Instances of CustomVector meet the requirements for a vector object: vector addition __add__ and scalar multiplication __mul__ are defined, and the objects are compatible with an inner product function such as inner_product(v1, v2). Note that CustomVector inherits from a base class mr.Vector. This is not required, but is recommended, as the base class provides some useful additional methods. The member function inner_product is useful, but not required. This example also uses the trapezoidal rule for inner products to account for a 3D arbitrary cartesian grid (vectors.InnerProductTrapz).

The class CustomVecHandle inherits from a base class mr.VecHandle, and defines methods _get and _put, which load and save vectors from/to Pickle files. Note the leading underscore: the functions get and put (without leading underscore) are defined in the VecHandle base class, and take care of scaling or subtracting a base vector. The methods _get and _put defined here are in turn called by the base class (the “template method” design pattern). While you don’t need to understand the guts of these base classes, more is covered in Interfacing with your data.

Here’s an example using these classes:

import os

import numpy as np

from modred import parallel
import modred as mr
from customvector import CustomVector, CustomVecHandle, inner_product


# Create directory for output files
out_dir = 'tutorial_ex6_out'
if not os.path.isdir(out_dir):
    os.makedirs(out_dir)

# Define snapshot handles
direct_snapshots = [
    CustomVecHandle('%s/direct_snap%d.pkl' % (out_dir, i), scale=np.pi)
    for i in mr.range(10)]
adjoint_snapshots = [
    CustomVecHandle('%s/adjoint_snap%d.pkl' % (out_dir, i), scale=np.pi)
    for i in mr.range(10)]

# Create random snapshot data
nx = 50
ny = 30
nz = 20
x = np.linspace(0, 1, nx)
y = np.logspace(1, 2, ny)
z = np.linspace(0, 1, nz) ** 2
if parallel.is_rank_zero():
    for snap in direct_snapshots + adjoint_snapshots:
        snap.put(CustomVector([x, y, z], np.random.random((nx, ny, nz))))
parallel.barrier()

# Compute and save balanced POD modes
my_BPOD = mr.BPODHandles(inner_product=inner_product)
my_BPOD.sanity_check(direct_snapshots[0])
sing_vals, L_sing_vecs, R_sing_vecs = my_BPOD.compute_decomp(
    direct_snapshots, adjoint_snapshots)

# Choose modes so that BPOD projection has less than 10% error
sing_vals_norm = sing_vals / np.sum(sing_vals)
num_modes = np.nonzero(np.cumsum(sing_vals_norm) > 0.9)[0][0] + 1
mode_nums = list(mr.range(num_modes))
direct_modes = [
    CustomVecHandle('%s/direct_mode%d.pkl' % (out_dir, i)) for i in mode_nums]
adjoint_modes = [
    CustomVecHandle('%s/adjoint_mode%d.pkl' % (out_dir, i)) for i in mode_nums]
my_BPOD.compute_direct_modes(mode_nums, direct_modes)
my_BPOD.compute_adjoint_modes(mode_nums, adjoint_modes)

This example is similar to previous ones, but some aspects are worth pointing out. The call to my_BPOD.sanity_check() runs some basic tests to verify that the vector handles and vectors behave as expected, so this can be useful for debugging. After execution, the modes are saved to direct_mode0.pkl, direct_mode1.pkl … and adjoint_mode0.pkl, adjoint_mode1.pkl. Also, note that the only time the CustomVector class is used is in generating the “fake” random data: most scripts will only need to deal with the vector handle classes, not the vectors themselves.

When you’re ready to start using modred, take a look at what types of vectors, file formats, and inner products are supplied in vectors. If you don’t find what you need, you can define your own vectors and vector handles following examples like this. Built-in classes like VecHandleArrayText and VecHandlePickle are common cases and serve as good examples since your own custom vector handle classes will probably resemble them.

As usual, this example can be executed in parallel without any modifications.

Array input and output¶

By default, put_array and get_array save and load to text files. If you prefer a different format, you can pass your own functions as keyword arguments put_array and get_array to the constructors. The functions should have the following interfaces: put_array(array, array_dest) and array = get_array(array_source). The array argument could be a 1D or 2D array.

References¶

[HLBR]

(1, 2, 3) P. Holmes, J. L. Lumley, G. Berkooz, and C. W. Rowley. Turbulence, Coherent Structures, Dynamical Systems and Symmetry, 2nd edition, Cambridge University Press, 2012.

[TRLBK]

J. H. Tu, C. W. Rowle, D. M. Luchtenburg, S. L. Brunton, J. N. Kutz. On Dynamic Mode Decmposition: Theory and Applications. Journal of Computational Dynamics, 1:391-421, Dec. 2014.