
Model inverse

Neural Network Model Implementation for PDE Parameter Inverse Problems.

This module implements the neural network architecture and training loop for solving inverse problems in PDEs using variational physics-informed neural networks (VPINNs). It focuses on identifying constant parameters while simultaneously solving the underlying PDE.

The implementation supports:
  • Parameter identification (see the sketch after this list)
  • Sensor data incorporation
  • Dirichlet boundary conditions
  • Custom loss function composition
  • Adaptive learning rate scheduling
  • Attention mechanisms (optional)
  • Efficient tensor operations
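
As a minimal standalone sketch of the first point (illustrative only, not this module's API): an unknown constant is wrapped in a trainable tf.Variable so that gradient descent updates it alongside the network weights, which is exactly what DenseModel_Inverse does for each entry of inverse_params_dict.

import tensorflow as tf

# Illustrative stand-in for the PDE/sensor loss: its minimum sits at the
# "true" parameter value eps = 0.1
eps = tf.Variable(0.5, dtype=tf.float32, trainable=True)
optimizer = tf.keras.optimizers.Adam(learning_rate=0.05)

for _ in range(500):
    with tf.GradientTape() as tape:
        loss = tf.square(eps - 0.1)
    grads = tape.gradient(loss, [eps])
    optimizer.apply_gradients(zip(grads, [eps]))

print(float(eps))  # converges toward 0.1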
Key classes
  • DenseModel_Inverse: Neural network model for inverse problems
Authors
  • Thivin Anandh (https://thivinanandh.github.io/)
Versions
  • 27-Dec-2024 (Version 0.1): Initial Implementation

DenseModel_Inverse

Bases: Model

Neural network model for PDE parameter identification.

This class implements a custom neural network architecture for solving inverse problems in PDEs. It combines parameter identification with PDE solution while incorporating sensor data and boundary conditions.

Attributes:
  • layer_dims: List of neurons per layer, including the input and output layers
  • learning_rate_dict: Learning rate configuration containing:
      - initial_learning_rate: Starting learning rate
      - use_lr_scheduler: Whether to use learning rate decay
      - decay_steps: Steps between learning rate updates
      - decay_rate: Factor for learning rate decay
      - staircase: Whether to use staircase decay
  • params_dict: Model parameters including:
      - n_cells: Number of cells in the domain
  • loss_function: Custom loss function for PDE residuals
  • tensor_dtype: TensorFlow data type for computations
  • sensor_list: List containing:
      - [0] sensor_points: Coordinates of the sensor locations
      - [1] sensor_values: Measured values at the sensors
  • inverse_params_dict: Dictionary of parameters to be identified, converted to trainable variables
  • use_attention: Whether to use an attention mechanism
  • activation: Activation function for the hidden layers
  • optimizer: Adam optimizer with an optional learning rate schedule
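
For reference, a complete learning_rate_dict with the scheduler enabled might look like the following (the key names come from the constructor below; the numeric values are illustrative). Note that all five keys are read unconditionally by the constructor, so they are required even when use_lr_scheduler is False.

learning_rate_dict = {
    "initial_learning_rate": 1e-3,  # starting learning rate for Adam
    "use_lr_scheduler": True,       # enable tf.keras ExponentialDecay
    "decay_steps": 1000,            # steps between learning rate updates
    "decay_rate": 0.96,             # multiplicative decay factor
    "staircase": True,              # decay in discrete steps
}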

Example

model = DenseModel_Inverse(
    layer_dims=[2, 64, 64, 1],
    learning_rate_dict={
        'initial_learning_rate': 0.001,
        'use_lr_scheduler': False,
        'decay_steps': 1000,
        'decay_rate': 0.99,
        'staircase': True,
    },
    params_dict={'n_cells': 100},
    loss_function=custom_loss,
    input_tensors_list=[input_tensor, dirichlet_input, dirichlet_actual],
    orig_factor_matrices=[shape_val_mat, grad_x_mat, grad_y_mat],
    force_function_list=force_matrix,
    sensor_list=[sensor_points, sensor_values],
    inverse_params_dict={'eps': 0.1},
    tensor_dtype=tf.float32,
)
for epoch in range(1000):
    history = model.train_step(beta=10, bilinear_params_dict=bilinear_params)

Note

The training process balances PDE residuals, boundary conditions, sensor data matching, and parameter identification through a weighted loss function.
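
Concretely, train_step below minimizes

    total_loss = total_pde_loss + beta * boundary_loss + beta_sensor * sensor_loss

where beta and beta_sensor are the weights passed to train_step.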

Source code in scirex/core/sciml/fastvpinns/model/model_inverse.py
class DenseModel_Inverse(tf.keras.Model):
    """Neural network model for PDE parameter identification.

    This class implements a custom neural network architecture for solving
    inverse problems in PDEs. It combines parameter identification with
    PDE solution while incorporating sensor data and boundary conditions.

    Attributes:
        layer_dims: List of neurons per layer including input/output
        learning_rate_dict: Learning rate configuration containing:
            - initial_learning_rate: Starting learning rate
            - use_lr_scheduler: Whether to use learning rate decay
            - decay_steps: Steps between learning rate updates
            - decay_rate: Factor for learning rate decay
            - staircase: Whether to use staircase decay
        params_dict: Model parameters including:
            - n_cells: Number of cells in the domain
        loss_function: Custom loss function for PDE residuals
        tensor_dtype: TensorFlow data type for computations
        sensor_list: List containing:
            [0]: sensor_points - Coordinates of sensor locations
            [1]: sensor_values - Measured values at sensors
        inverse_params_dict: Dictionary of parameters to be identified,
            converted to trainable variables
        use_attention: Whether to use attention mechanism
        activation: Activation function for hidden layers
        optimizer: Adam optimizer with optional learning rate schedule

    Example:
        >>> model = DenseModel_Inverse(
        ...     layer_dims=[2, 64, 64, 1],
        ...     learning_rate_dict={
        ...         'initial_learning_rate': 0.001,
        ...         'use_lr_scheduler': False,
        ...         'decay_steps': 1000,
        ...         'decay_rate': 0.99,
        ...         'staircase': True,
        ...     },
        ...     params_dict={'n_cells': 100},
        ...     loss_function=custom_loss,
        ...     input_tensors_list=[input_tensor, dirichlet_input, dirichlet_actual],
        ...     orig_factor_matrices=[shape_val_mat, grad_x_mat, grad_y_mat],
        ...     force_function_list=force_matrix,
        ...     sensor_list=[sensor_points, sensor_values],
        ...     inverse_params_dict={'eps': 0.1},
        ...     tensor_dtype=tf.float32,
        ... )
        >>> for epoch in range(1000):
        ...     history = model.train_step(beta=10, bilinear_params_dict=bilinear_params)

    Note:
        The training process balances PDE residuals, boundary conditions,
        sensor data matching, and parameter identification through a
        weighted loss function.
    """

    def __init__(
        self,
        layer_dims: list,
        learning_rate_dict: dict,
        params_dict: dict,
        loss_function,
        input_tensors_list: list,
        orig_factor_matrices: list,
        force_function_list: list,
        sensor_list: list,  # for inverse problem
        inverse_params_dict: dict,  # for inverse problem
        tensor_dtype,
        use_attention=False,
        activation="tanh",
        hessian=False,
    ):
        super(DenseModel_Inverse, self).__init__()
        self.layer_dims = layer_dims
        self.use_attention = use_attention
        self.activation = activation
        self.layer_list = []
        self.loss_function = loss_function
        self.hessian = hessian

        self.tensor_dtype = tensor_dtype

        self.sensor_list = sensor_list
        # obtain sensor values
        self.sensor_points = sensor_list[0]
        self.sensor_values = sensor_list[1]

        # inverse params dict
        self.inverse_params_dict = inverse_params_dict

        # Convert all the values within inverse_params_dict to trainable variables
        for key, value in self.inverse_params_dict.items():
            self.inverse_params_dict[key] = tf.Variable(
                value, dtype=self.tensor_dtype, trainable=True
            )
            tf.print(f"Key : {key} , Value : {self.inverse_params_dict[key]}")

        # add the inverse parameters to the trainable variables of the model
        self.trainable_variables.extend(self.inverse_params_dict.values())

        # if dtype is not a valid tensorflow dtype, raise an error
        if not isinstance(self.tensor_dtype, tf.DType):
            raise TypeError("The given dtype is not a valid tensorflow dtype")

        self.orig_factor_matrices = orig_factor_matrices
        self.shape_function_mat_list = copy.deepcopy(orig_factor_matrices[0])
        self.shape_function_grad_x_factor_mat_list = copy.deepcopy(
            orig_factor_matrices[1]
        )
        self.shape_function_grad_y_factor_mat_list = copy.deepcopy(
            orig_factor_matrices[2]
        )

        self.force_function_list = force_function_list

        self.input_tensors_list = input_tensors_list
        self.input_tensor = copy.deepcopy(input_tensors_list[0])
        self.dirichlet_input = copy.deepcopy(input_tensors_list[1])
        self.dirichlet_actual = copy.deepcopy(input_tensors_list[2])

        self.params_dict = params_dict

        self.pre_multiplier_val = self.shape_function_mat_list
        self.pre_multiplier_grad_x = self.shape_function_grad_x_factor_mat_list
        self.pre_multiplier_grad_y = self.shape_function_grad_y_factor_mat_list

        self.force_matrix = self.force_function_list

        print(f"{'-'*74}")
        print(f"| {'PARAMETER':<25} | {'SHAPE':<25} |")
        print(f"{'-'*74}")
        print(
            f"| {'input_tensor':<25} | {str(self.input_tensor.shape):<25} | {self.input_tensor.dtype}"
        )
        print(
            f"| {'force_matrix':<25} | {str(self.force_matrix.shape):<25} | {self.force_matrix.dtype}"
        )
        print(
            f"| {'pre_multiplier_grad_x':<25} | {str(self.pre_multiplier_grad_x.shape):<25} | {self.pre_multiplier_grad_x.dtype}"
        )
        print(
            f"| {'pre_multiplier_grad_y':<25} | {str(self.pre_multiplier_grad_y.shape):<25} | {self.pre_multiplier_grad_y.dtype}"
        )
        print(
            f"| {'pre_multiplier_val':<25} | {str(self.pre_multiplier_val.shape):<25} | {self.pre_multiplier_val.dtype}"
        )
        print(
            f"| {'dirichlet_input':<25} | {str(self.dirichlet_input.shape):<25} | {self.dirichlet_input.dtype}"
        )
        print(
            f"| {'dirichlet_actual':<25} | {str(self.dirichlet_actual.shape):<25} | {self.dirichlet_actual.dtype}"
        )
        print(f"{'-'*74}")

        self.n_cells = params_dict["n_cells"]

        ## ----------------------------------------------------------------- ##
        ## ---------- LEARNING RATE AND OPTIMISER FOR THE MODEL ------------ ##
        ## ----------------------------------------------------------------- ##

        # parse the learning rate dictionary
        self.learning_rate_dict = learning_rate_dict
        initial_learning_rate = learning_rate_dict["initial_learning_rate"]
        use_lr_scheduler = learning_rate_dict["use_lr_scheduler"]
        decay_steps = learning_rate_dict["decay_steps"]
        decay_rate = learning_rate_dict["decay_rate"]
        staircase = learning_rate_dict["staircase"]

        if use_lr_scheduler:
            learning_rate_fn = tf.keras.optimizers.schedules.ExponentialDecay(
                initial_learning_rate, decay_steps, decay_rate, staircase=staircase
            )
        else:
            learning_rate_fn = initial_learning_rate

        self.optimizer = tf.keras.optimizers.Adam(learning_rate=learning_rate_fn)

        ## ----------------------------------------------------------------- ##
        ## --------------------- MODEL ARCHITECTURE ------------------------ ##
        ## ----------------------------------------------------------------- ##

        # Build dense layers based on the input list
        for dim in range(len(self.layer_dims) - 2):
            self.layer_list.append(
                layers.Dense(
                    self.layer_dims[dim + 1],
                    activation=self.activation,
                    kernel_initializer="glorot_uniform",
                    dtype=self.tensor_dtype,
                    bias_initializer="zeros",
                )
            )

        # Add an output layer with no activation
        self.layer_list.append(
            layers.Dense(
                self.layer_dims[-1],
                activation=None,
                kernel_initializer="glorot_uniform",
                dtype=self.tensor_dtype,
                bias_initializer="zeros",
            )
        )

        # Add attention layer if required
        if self.use_attention:
            self.attention_layer = layers.Attention()

        # Compile the model
        self.compile(optimizer=self.optimizer)
        self.build(input_shape=(None, self.layer_dims[0]))

        # print the summary of the model
        self.summary()

    def call(self, inputs):
        """
        Applies the model to the input data.

        Args:
            inputs: The input data.

        Returns:
            The output of the model after applying all the layers.
        """
        x = inputs

        # Apply attention layer after input if flag is True
        if self.use_attention:
            x = self.attention_layer([x, x])

        # Loop through the dense layers
        for layer in self.layer_list:
            x = layer(x)

        return x

    def get_config(self):
        """
        Returns the configuration of the model.

        This method is used to serialize the model configuration. It returns a dictionary
        containing all the necessary information to recreate the model.

        Returns:
            dict: The configuration dictionary of the model.
        """
        # Get the base configuration
        base_config = super().get_config()

        # Add the non-serializable arguments to the configuration
        base_config.update(
            {
                "learning_rate_dict": self.learning_rate_dict,
                "loss_function": self.loss_function,
                "input_tensors_list": self.input_tensors_list,
                "orig_factor_matrices": self.orig_factor_matrices,
                "force_function_list": self.force_function_list,
                "params_dict": self.params_dict,
                "use_attention": self.use_attention,
                "activation": self.activation,
                "hessian": self.hessian,
                "layer_dims": self.layer_dims,
                "tensor_dtype": self.tensor_dtype,
                "sensor_list": self.sensor_list,
                "inverse_params_dict": self.inverse_params_dict,
            }
        )

        return base_config

    @tf.function
    def train_step(
        self, beta=10, beta_sensor=10, bilinear_params_dict=None
    ) -> dict:  # pragma: no cover
        """
        Performs a single training step on the model.

        Args:
            beta: The weight for the boundary condition loss.
            beta_sensor: The weight for the sensor data loss.
            bilinear_params_dict: The bilinear parameters dictionary.

        Returns:
            dict: A dictionary containing the loss values.
        """

        with tf.GradientTape(persistent=True) as tape:
            # Predict the values for dirichlet boundary conditions
            predicted_values_dirichlet = self(self.dirichlet_input)

            # predict the sensor values
            predicted_sensor_values = self(self.sensor_points)

            # initialize the total PDE loss
            total_pde_loss = 0.0

            with tf.GradientTape(persistent=True) as tape1:
                # tape gradient
                tape1.watch(self.input_tensor)
                # Compute the predicted values from the model
                predicted_values = self(self.input_tensor)

            # compute the gradients of the predicted values wrt the input which is (x, y)
            gradients = tape1.gradient(predicted_values, self.input_tensor)

            # Split the gradients into x and y components and reshape each to
            # (n_cells, n_quadrature_points) for the tensorial operations below
            pred_grad_x = tf.reshape(
                gradients[:, 0], [self.n_cells, self.pre_multiplier_grad_x.shape[-1]]
            )  # shape : (N_cells , N_quadrature_points)
            pred_grad_y = tf.reshape(
                gradients[:, 1], [self.n_cells, self.pre_multiplier_grad_y.shape[-1]]
            )  # shape : (N_cells , N_quadrature_points)

            pred_val = tf.reshape(
                predicted_values, [self.n_cells, self.pre_multiplier_val.shape[-1]]
            )  # shape : (N_cells , N_quadrature_points)

            cells_residual = self.loss_function(
                test_shape_val_mat=self.pre_multiplier_val,
                test_grad_x_mat=self.pre_multiplier_grad_x,
                test_grad_y_mat=self.pre_multiplier_grad_y,
                pred_nn=pred_val,
                pred_grad_x_nn=pred_grad_x,
                pred_grad_y_nn=pred_grad_y,
                forcing_function=self.force_matrix,
                bilinear_params=bilinear_params_dict,
                inverse_params_dict=self.inverse_params_dict,
            )

            residual = tf.reduce_sum(cells_residual)

            # tf.print("Residual : ", residual)
            # tf.print("Residual Shape : ", residual.shape)

            # Compute the total loss for the PDE
            total_pde_loss = total_pde_loss + residual

            # convert predicted_values_dirichlet to tf.float64
            # predicted_values_dirichlet = tf.cast(predicted_values_dirichlet, tf.float64)

            # print shapes of the predicted values and the actual values
            boundary_loss = tf.reduce_mean(
                tf.square(predicted_values_dirichlet - self.dirichlet_actual), axis=0
            )

            # Sensor loss
            sensor_loss = tf.reduce_mean(
                tf.square(predicted_sensor_values - self.sensor_values), axis=0
            )

            # tf.print("Boundary Loss : ", boundary_loss)
            # tf.print("Boundary Loss Shape : ", boundary_loss.shape)
            # tf.print("Total PDE Loss : ", total_pde_loss)
            # tf.print("Total PDE Loss Shape : ", total_pde_loss.shape)

            # Compute Total Loss
            total_loss = (
                total_pde_loss + beta * boundary_loss + beta_sensor * sensor_loss
            )

        trainable_vars = self.trainable_variables
        self.gradients = tape.gradient(total_loss, trainable_vars)
        self.optimizer.apply_gradients(zip(self.gradients, trainable_vars))

        return {
            "loss_pde": total_pde_loss,
            "loss_dirichlet": boundary_loss,
            "loss": total_loss,
            "inverse_params": self.inverse_params_dict,
            "sensor_loss": sensor_loss,
        }
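
A minimal sketch of a typical training driver, assuming model was built as in the Example above and that bilinear_params is the dictionary expected by the custom loss function:

for epoch in range(5000):
    hist = model.train_step(
        beta=10, beta_sensor=10, bilinear_params_dict=bilinear_params
    )
    if epoch % 500 == 0:
        # hist["inverse_params"] holds the live tf.Variables, so this
        # prints the current estimate of each identified parameter
        tf.print("epoch:", epoch, "loss:", hist["loss"],
                 "eps:", hist["inverse_params"]["eps"])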

call(inputs)

Applies the model to the input data.

Parameters:
  • inputs: The input data. (required)

Returns:
  • The output of the model after applying all the layers.

Source code in scirex/core/sciml/fastvpinns/model/model_inverse.py
def call(self, inputs):
    """
    Applies the model to the input data.

    Args:
        inputs: The input data.

    Returns:
        The output of the model after applying all the layers.
    """
    x = inputs

    # Apply attention layer after input if flag is True
    if self.use_attention:
        x = self.attention_layer([x, x])

    # Loop through the dense layers
    for layer in self.layer_list:
        x = layer(x)

    return x
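
Since call simply forwards the input through the dense stack, a trained model can be evaluated directly at arbitrary points. A sketch, assuming the model above was built with layer_dims[0] == 2 and tf.float32:

import tensorflow as tf

query_points = tf.constant([[0.25, 0.25], [0.5, 0.5]], dtype=tf.float32)
u_pred = model(query_points)  # shape: (2, layer_dims[-1])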

get_config()

Returns the configuration of the model.

This method is used to serialize the model configuration. It returns a dictionary containing all the necessary information to recreate the model.

Returns:
  • dict: The configuration dictionary of the model.

Source code in scirex/core/sciml/fastvpinns/model/model_inverse.py
def get_config(self):
    """
    Returns the configuration of the model.

    This method is used to serialize the model configuration. It returns a dictionary
    containing all the necessary information to recreate the model.

    Returns:
        dict: The configuration dictionary of the model.
    """
    # Get the base configuration
    base_config = super().get_config()

    # Add the non-serializable arguments to the configuration
    base_config.update(
        {
            "learning_rate_dict": self.learning_rate_dict,
            "loss_function": self.loss_function,
            "input_tensors_list": self.input_tensors_list,
            "orig_factor_matrices": self.orig_factor_matrices,
            "force_function_list": self.force_function_list,
            "params_dict": self.params_dict,
            "use_attention": self.use_attention,
            "activation": self.activation,
            "hessian": self.hessian,
            "layer_dims": self.layer_dims,
            "tensor_dtype": self.tensor_dtype,
            "sensor_list": self.sensor_list,
            "inverse_params_dict": self.inverse_params_dict,
        }
    )

    return base_config
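
A quick way to inspect the stored configuration. Note that several entries, such as the loss function and the raw input tensors, are live Python/TensorFlow objects, so the dictionary is not directly JSON-serializable:

config = model.get_config()
print(config["layer_dims"], config["activation"], config["use_attention"])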

train_step(beta=10, beta_sensor=10, bilinear_params_dict=None)

Performs a single training step on the model.

Parameters:
  • beta: The weight for the boundary condition loss. Default: 10
  • beta_sensor: The weight for the sensor data loss. Default: 10
  • bilinear_params_dict: The bilinear parameters dictionary. Default: None

Returns:
  • dict: A dictionary containing the loss values.

Source code in scirex/core/sciml/fastvpinns/model/model_inverse.py
@tf.function
def train_step(
    self, beta=10, beta_sensor=10, bilinear_params_dict=None
) -> dict:  # pragma: no cover
    """
    Performs a single training step on the model.

    Args:
        beta: The weight for the boundary condition loss.
        beta_sensor: The weight for the sensor data loss.
        bilinear_params_dict: The bilinear parameters dictionary.

    Returns:
        dict: A dictionary containing the loss values.
    """

    with tf.GradientTape(persistent=True) as tape:
        # Predict the values for dirichlet boundary conditions
        predicted_values_dirichlet = self(self.dirichlet_input)

        # predict the sensor values
        predicted_sensor_values = self(self.sensor_points)

        # initialize the total PDE loss
        total_pde_loss = 0.0

        with tf.GradientTape(persistent=True) as tape1:
            # tape gradient
            tape1.watch(self.input_tensor)
            # Compute the predicted values from the model
            predicted_values = self(self.input_tensor)

        # compute the gradients of the predicted values wrt the input which is (x, y)
        gradients = tape1.gradient(predicted_values, self.input_tensor)

        # Split the gradients into x and y components and reshape each to
        # (n_cells, n_quadrature_points) for the tensorial operations below
        pred_grad_x = tf.reshape(
            gradients[:, 0], [self.n_cells, self.pre_multiplier_grad_x.shape[-1]]
        )  # shape : (N_cells , N_quadrature_points)
        pred_grad_y = tf.reshape(
            gradients[:, 1], [self.n_cells, self.pre_multiplier_grad_y.shape[-1]]
        )  # shape : (N_cells , N_quadrature_points)

        pred_val = tf.reshape(
            predicted_values, [self.n_cells, self.pre_multiplier_val.shape[-1]]
        )  # shape : (N_cells , N_quadrature_points)

        cells_residual = self.loss_function(
            test_shape_val_mat=self.pre_multiplier_val,
            test_grad_x_mat=self.pre_multiplier_grad_x,
            test_grad_y_mat=self.pre_multiplier_grad_y,
            pred_nn=pred_val,
            pred_grad_x_nn=pred_grad_x,
            pred_grad_y_nn=pred_grad_y,
            forcing_function=self.force_matrix,
            bilinear_params=bilinear_params_dict,
            inverse_params_dict=self.inverse_params_dict,
        )

        residual = tf.reduce_sum(cells_residual)

        # tf.print("Residual : ", residual)
        # tf.print("Residual Shape : ", residual.shape)

        # Compute the total loss for the PDE
        total_pde_loss = total_pde_loss + residual

        # convert predicted_values_dirichlet to tf.float64
        # predicted_values_dirichlet = tf.cast(predicted_values_dirichlet, tf.float64)

        # print shapes of the predicted values and the actual values
        boundary_loss = tf.reduce_mean(
            tf.square(predicted_values_dirichlet - self.dirichlet_actual), axis=0
        )

        # Sensor loss
        sensor_loss = tf.reduce_mean(
            tf.square(predicted_sensor_values - self.sensor_values), axis=0
        )

        # tf.print("Boundary Loss : ", boundary_loss)
        # tf.print("Boundary Loss Shape : ", boundary_loss.shape)
        # tf.print("Total PDE Loss : ", total_pde_loss)
        # tf.print("Total PDE Loss Shape : ", total_pde_loss.shape)

        # Compute Total Loss
        total_loss = (
            total_pde_loss + beta * boundary_loss + beta_sensor * sensor_loss
        )

    trainable_vars = self.trainable_variables
    self.gradients = tape.gradient(total_loss, trainable_vars)
    self.optimizer.apply_gradients(zip(self.gradients, trainable_vars))

    return {
        "loss_pde": total_pde_loss,
        "loss_dirichlet": boundary_loss,
        "loss": total_loss,
        "inverse_params": self.inverse_params_dict,
        "sensor_loss": sensor_loss,
    }
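
After training, the identified parameters can be read off directly, since inverse_params_dict holds the trained tf.Variables. A sketch, assuming the 'eps' parameter from the Example above:

eps_identified = model.inverse_params_dict["eps"].numpy()
print(f"identified eps = {eps_identified:.6f}")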