
Deep Neural Network Hyperelasticity


Deep Neural Networks (DNNs) are a powerful Machine Learning method that can be used for many cool applications (e.g. image recognition, image creation, and general data analysis). In this article I will show how to create a hyperelastic model using a DNN. I will present the complete TensorFlow Python code for the implementation, and I will demonstrate some of the weaknesses of the approach.

First Some Definitions

Regression analysis is a process for estimating the relationships between a dependent variable (e.g. stress) and a set of independent variables (e.g. strain).

Regression is a supervised Machine Learning (ML) technique. The ultimate goal of the regression algorithm is to determine a best-fit predicted curve (or function) to known data.

Supervised Learning (SL) is a ML approach where input values and a desired output value are used to train a model.

So you are using supervised machine learning when you calibrate a material model using MCalibration.

Traditional Hyperelasticity

In a traditional approach, known [strain, stress] data is used to calibrate the hyperelastic material model parameters. This is often performed using MCalibration, and is a supervised machine learning approach. Once this is done, the calibrated material parameters, together with a given strain value, can be used to calculate the stress. This is then used in an FE simulation at each integration point and each time increment.
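Because the uniaxial Yeoh stress is linear in the material parameters (C10, C20, C30), this traditional calibration step can even be sketched as an ordinary least-squares fit. The following is a minimal illustration with synthetic, noise-free data and hypothetical parameter values, not the actual MCalibration workflow:

```python
import numpy as np

# Uniaxial Yeoh stress is linear in (c10, c20, c30):
# sigma = 2*(c10 + 2*c20*(I1-3) + 3*c30*(I1-3)**2) * (lam**2 - 1/lam)
lam = np.exp(np.linspace(0.0, 1.0, 50))   # stretches from true strains in [0, 1]
i1 = lam**2 + 2.0 / lam                   # first invariant for uniaxial loading
f = lam**2 - 1.0 / lam                    # common stress factor

c_true = np.array([0.70, 0.10, 0.03])     # "unknown" parameters to recover
A = np.column_stack([2*f, 4*(i1-3)*f, 6*(i1-3)**2*f])
stress = A @ c_true                       # synthetic [strain, stress] test data

c_fit, *_ = np.linalg.lstsq(A, stress, rcond=None)
print(c_fit)   # recovers [0.70, 0.10, 0.03]
```

With noisy experimental data the same least-squares solve gives the best-fit parameters instead of an exact recovery.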

Deep Neural Network (DNN) Hyperelasticity

The goal of this article is to demonstrate how we can replace the hyperelastic model with a Deep Neural Network (DNN) model.

The stress for an isotropic hyperelastic model with a free energy function that depends on the invariants (I1, I2, and J) is given by: $$\boldsymbol{\sigma} = \displaystyle \frac{2}{J} \left[ \frac{\partial\Psi}{\partial I_1^*} + \frac{\partial\Psi}{\partial I_2^*} I_1^* \right] \mathbf{b}^* - \frac{2}{J} \frac{\partial\Psi}{\partial I_2^*} (\mathbf{b}^*)^2 + \left[ \frac{\partial\Psi}{\partial J} - \frac{2I_1^*}{3J} \frac{\partial\Psi}{\partial I_1^*} - \frac{4I_2^*}{3J} \frac{\partial\Psi}{\partial I_2^*} \right] \mathbf{I}.$$ To simplify the arguments I will limit the discussion here to hyperelastic models with an energy function that only depends on the first invariant (I1). In this case the stress can be calculated from: $$\boldsymbol{\sigma} = \displaystyle \frac{2}{J} \frac{\partial\Psi}{\partial I_1^*} \text{dev}[\mathbf{b}^*] + \frac{\partial\Psi}{\partial J} \mathbf{I}.$$ There are different ways to make this suitable for a Deep Neural Network approach. One way is to let the DNN convert the first invariant (I1) to the derivative of the free energy with respect to the first invariant:

The material model workflow in a FE analysis would then be given by the following steps:

  • The material model is given the deformation gradient
  • Calculate I1 from the deformation gradient
  • Calculate \({\partial\Psi}/{\partial I_1^*}\) using the DNN
  • Calculate the stress tensor using the continuum mechanics equation above.
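For an incompressible uniaxial case, the steps above can be sketched as follows. Here `dpsi_di1` stands in for the trained DNN; a closed-form Yeoh derivative with hypothetical parameter values is used instead so the sketch is self-contained:

```python
import numpy as np

def uniaxial_cauchy_stress(true_strain, dpsi_di1):
    """Workflow above, specialized to incompressible uniaxial tension (J = 1)."""
    lam = np.exp(true_strain)                      # stretch from the deformation gradient
    b_star = np.diag([lam**2, 1.0/lam, 1.0/lam])   # isochoric left Cauchy-Green tensor
    i1 = np.trace(b_star)                          # first invariant I1*
    dev_b = b_star - np.trace(b_star)/3.0 * np.eye(3)
    sigma = 2.0 * dpsi_di1(i1) * dev_b             # deviatoric part of the stress equation
    # uniaxial stress = sigma_11 - sigma_22 (the pressure term cancels out)
    return sigma[0, 0] - sigma[1, 1]

# Yeoh derivative as a stand-in for the DNN output
yeoh = lambda i1, c=(0.70, 0.10, 0.03): c[0] + 2*c[1]*(i1-3) + 3*c[2]*(i1-3)**2
print(uniaxial_cauchy_stress(0.5, yeoh))
```

In an actual FE implementation, `dpsi_di1` would be replaced by a call to the trained network.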

To further simplify the discussion, I will focus on uniaxial stresses and strains.  Using this approach the DNN simply becomes the following:

After performing some hyperparameter tuning I have determined that the following neural network structure is one of the smallest that can capture the stress-strain response of a Yeoh hyperelastic model with reasonable accuracy.

This model has a scalar input (strain), a scalar output (stress), and 2 dense hidden layers, each with 20 perceptrons (nodes). Each perceptron multiplies its input vector by a weight vector, then adds a bias, and finally applies an activation function:

In vector form the equation for the first hidden layer becomes: $$\mathbf{y}^{(1)} = g\left(\mathbf{w}_0^{(1)} + \mathbf{w}^{(1)} \mathbf{x}\right).$$The equation for the second hidden layer is: $$\mathbf{y}^{(2)} = g\left(\mathbf{w}_0^{(2)} + \mathbf{w}^{(2)} \mathbf{y}^{(1)}\right),$$and the equation for the output layer is: $$\hat{\mathbf{y}} = g\left(\mathbf{w}_0^{(3)} + \mathbf{w}^{(3)} \mathbf{y}^{(2)}\right).$$
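In NumPy, the three layer equations translate directly into a forward pass. The sketch below uses randomly initialized weights purely for illustration; a trained network would use the fitted values:

```python
import numpy as np

rng = np.random.default_rng(0)
relu = lambda z: np.maximum(z, 0.0)

# randomly initialized weights and biases matching the 1-20-20-1 architecture
W1, b1 = rng.normal(size=(20, 1)),  rng.normal(size=20)
W2, b2 = rng.normal(size=(20, 20)), rng.normal(size=20)
W3, b3 = rng.normal(size=(1, 20)),  rng.normal(size=1)

def forward(x):
    y1 = relu(b1 + W1 @ x)    # first hidden layer:  y1 = g(w0 + W x)
    y2 = relu(b2 + W2 @ y1)   # second hidden layer: y2 = g(w0 + W y1)
    return b3 + W3 @ y2       # linear (pass-through) output layer

print(forward(np.array([0.25])))   # one scalar stress prediction, shape (1,)
```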

In the hidden layers I used the ReLU activation function (the ramp function), and in the output layer I used a linear (pass-through) function. The following TensorFlow Python code solves the whole problem. In this example I used 5,000 [strain, stress] points to train the DNN, and I fit the weight factors using the Nadam algorithm for 500 iterations (epochs).

import numpy as np
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
from matplotlib import pyplot as plt

def yeoh_uniax(params):
    # Uniaxial Cauchy stress for an incompressible Yeoh model.
    # params = [true strain, C10, C20, C30]
    lam = np.exp(params[0])
    c10 = params[1]
    c20 = params[2]
    c30 = params[3]
    i1 = lam * lam + 2.0 / lam
    return 2.0 * (c10 + 2.0 * c20 * (i1 - 3.0) + 3.0 * c30 * (i1 - 3.0) ** 2) * (lam ** 2 - 1.0 / lam)

def create_sets(nn):
    # Generate nn random [strain, stress] training points.
    xx = np.zeros((nn, 1))  # [strain]
    yy = np.zeros((nn, 1))  # [stress]
    c10 = 0.70
    c20 = 0.10
    c30 = 0.03
    for ii in range(nn):
        true_strain = np.random.uniform(0.0, 1.0)
        xx[ii, :] = true_strain
        yy[ii, :] = yeoh_uniax([true_strain, c10, c20, c30])
    return xx, yy

train_x, train_y = create_sets(5000)
model = Sequential()
model.add(Dense(20, activation='relu', kernel_initializer='he_normal', input_dim=1))
model.add(Dense(20, activation='relu', kernel_initializer='he_normal'))
model.add(Dense(1, activation='linear', kernel_initializer='he_normal'))
model.compile(optimizer='nadam', loss='mse', metrics=['mean_absolute_error'])
model.fit(train_x, train_y, epochs=500, batch_size=50, verbose=1)
model.summary()

# evaluate model
npts = 40
strain_vec = np.linspace(0.0, 1.0, npts)
stress_vec_pred = np.zeros(npts)
stress_vec_real = np.zeros(npts)
c10 = 0.70
c20 = 0.10
c30 = 0.03
for i, val in enumerate(strain_vec):
    xin = np.array([val]).reshape(1, 1)
    stress_vec_pred[i] = model.predict(xin, verbose=0).flatten()
    stress_vec_real[i] = yeoh_uniax([val, c10, c20, c30])
plt.plot(strain_vec, stress_vec_pred, c='r', label="ML")
plt.plot(strain_vec, stress_vec_real, c='b', label="Yeoh")
plt.xlabel("True Strain")
plt.ylabel("True Stress")
plt.legend()
plt.show()

The following is the tail end of the output from running this code:

					Epoch 499/500
100/100 [==============================] - 0s 647us/step - loss: 0.0193 - mean_absolute_error: 0.1065
Epoch 500/500
100/100 [==============================] - 0s 635us/step - loss: 0.0197 - mean_absolute_error: 0.1073
Model: "sequential"
 Layer (type)                Output Shape              Param #   
 dense (Dense)               (None, 20)                40        
 dense_1 (Dense)             (None, 20)                420       
 dense_2 (Dense)             (None, 1)                 21        


And the following image is created:

These results show that it is possible to represent a Yeoh hyperelastic model using a Deep Neural Network (DNN) model. By increasing the number of perceptrons (nodes) the accuracy can be further improved.

Number of Floating Point Operations

The traditional implementation of the Yeoh model for a 1D case requires about 14 floating point operations (FLOPs) for each stress update. Since a prediction with a trained DNN requires a fixed number of matrix multiplications and vector additions, it is possible to calculate the number of FLOPs for the DNN as well. The network that I selected for this example requires about 901 FLOPs for a stress update. In other words, the DNN is more than 50 times slower than the traditional approach.
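A quick way to arrive at a count of this magnitude is to sum per-layer multiplies and adds. The sketch below counts one FLOP per multiply, per accumulation add, and per bias add, and ignores the activation functions; a slightly different accounting convention will shift the total by a few FLOPs relative to the ~901 figure above:

```python
def dense_flops(n_in, n_out):
    # each output node: n_in multiply-add pairs plus one bias add
    return n_out * (2 * n_in + 1)

layers = [(1, 20), (20, 20), (20, 1)]   # the 1-20-20-1 network used in this article
total = sum(dense_flops(i, o) for i, o in layers)
print(total)   # 921 under this convention, on the order of the ~901 quoted above
```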


Summary

  • It is simple to create and train a Deep Neural Network (DNN) based hyperelastic model.
  • The DNN can be as accurate as traditional hyperelastic models.
  • The DNN will be much slower than traditional hyperelastic models and is therefore not a practical solution.
  • It is important to use the right tool for the right problem!


2 thoughts on “Deep Neural Network Hyperelasticity”

  1. A true story. 10 years ago, I presented Jorgen’s TN model to our company’s CTO (a world top-100 manufacturing company), who is a member of the French academy of sciences and a renowned scientist. He said something like “with 20 some parameters in the model, you can fit anything”. He felt a bit better when I explained that there were like 5 different physics behind it.
    Now imagine presenting him with a DNN.

  2. That is a really interesting story. DNN does not have much physics behind it, but perhaps all the attention it gets somehow makes it more acceptable. Thanks for sharing!
