Data Driven Models
- A system model can be developed from data describing the system
- Computational techniques can be used to fit a model to the data
Modelling Approaches
White Box
- A white box model is a physical modelling approach, used where all the information about a system and its components is known.
- For example: "What is the voltage across a 10 Ω resistor?"
- The value of the resistor is known, so a mathematical model can be developed using knowledge of physics (Ohm's law in this case)
- The model is then tested against data gathered from the system
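As a minimal sketch of the white box idea, the resistor example can be written directly from Ohm's law. The test currents below are hypothetical values chosen for illustration; only the resistance comes from the example.

```matlab
% White box sketch: every parameter is known, so the model comes from physics.
R = 10;             % resistance in ohms (from the example)
I = 0:0.1:0.5;      % assumed test currents in amps (hypothetical values)
V = I * R;          % Ohm's law: V = I*R
disp([I' V']);      % current against predicted voltage, one row per test point
```

The predicted voltages would then be compared against measurements taken from the real circuit.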
Grey Box
- A grey box model is similar to a white box model, except that some physical parameters are unknown
- A model is developed using known physical properties, with the unknown parameters left as variables
- Data is then collected from testing and used to find the unknown parameters
- For example: "What is the force required to stretch this spring by $x$ mm, when the stiffness $k$ is unknown?"
- Using knowledge of physics (Hooke's law), $F = kx$
- Test the spring to collect data
- Find the value of $k$ that best fits the data to create a model
- The final model is then tested
- Physical modelling used to get the form of the model, testing used to find unknown parameters
- This, and white box, is mostly what's been done so far
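The grey box steps for the spring could be sketched as follows. The test data here is simulated from a made-up stiffness (`k_true`, a hypothetical value) purely so the script runs on its own; in practice `x` and `F` would be measurements. The backslash operator is used as one reasonable way to get a least-squares estimate of the stiffness, not necessarily the method intended in these notes.

```matlab
% Grey box sketch: the form F = k*x comes from physics, k comes from data.
k_true = 2.5;                            % hypothetical stiffness in N/mm, only used to simulate a test
x = (1:10)';                             % extensions in mm
F = k_true * x + 0.2 * randn(size(x));   % simulated noisy force measurements in N

k_est = x \ F;                           % least-squares estimate of k for the model F = k*x
F_model = k_est * x;                     % model predictions at the tested extensions
rmse = sqrt(mean((F - F_model).^2));     % how closely the fitted model matches the data
fprintf('Estimated k = %.3f N/mm, RMSE = %.3f N\n', k_est, rmse);
```

Physics supplies the structure of the model and the data supplies only the single unknown number, which is why a handful of test points is enough here.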
Black Box
"Here is a new battery. We know nothing about it. How does its performance respond to changes in temperature?"
- Used to build models of a system where the internal operation of it is completely unknown: a "black box"
- Data is collected from testing the system
- An appropriate mathematical model is selected to fit the data
- The model is fitted to the data and the quality of the fit is checked
- The model is tested on new data to see how closely it models system behaviour
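The black box workflow above might look like the sketch below. The data is a synthetic stand-in (the variable names, the hidden relationship and the noise level are all assumptions for illustration, not battery measurements), and the quadratic model is an arbitrary choice of model form.

```matlab
% Black box sketch: only input/output data is available.
% Synthetic stand-in for test data (hypothetical; a real study would use measurements).
T = linspace(0, 40, 41);                               % input, e.g. temperature in deg C
perf = 80 + 1.2*T - 0.02*T.^2 + 0.5*randn(size(T));    % noisy measured output

% Split into data used for fitting and new data held back for testing.
idx_fit  = 1:2:numel(T);
idx_test = 2:2:numel(T);

% Select a model form (a quadratic here, an arbitrary choice) and fit it.
P = polyfit(T(idx_fit), perf(idx_fit), 2);

% Compare the error on the fitting data with the error on the unseen data.
rmse_fit  = sqrt(mean((perf(idx_fit)  - polyval(P, T(idx_fit))).^2));
rmse_test = sqrt(mean((perf(idx_test) - polyval(P, T(idx_test))).^2));
fprintf('RMSE (fit) = %.3f, RMSE (new data) = %.3f\n', rmse_fit, rmse_test);
```

If the error on the new data is much larger than on the fitting data, that suggests the chosen model form does not capture the system's behaviour well.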
Modelling in Matlab
Regression
- Regression is predicting a continuous response from a set of predictor values
- e.g. predict the extension of a spring given force, temperature and age
- Learn a function that maps a set of predictor variables to a set of response variables
For a linear model of some data, $y = mx + c$:
- $x$ and $y$ are the predictor and response values from the data set
- $m$ and $c$ are the unknowns to be estimated from the data
- Polynomial models can be used for more complex data
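As a sketch of what a polynomial model looks like, a degree-$n$ model in one predictor $x$ can be written as below. The coefficient names $p_1, \dots, p_{n+1}$ are an assumption, chosen to match the highest-power-first ordering that Matlab's polyfit returns; they are not taken from these notes.

```latex
y = p_1 x^{n} + p_2 x^{n-1} + \dots + p_n x + p_{n+1}
```

With $n = 1$ this reduces to the straight line case, with slope $p_1$ and intercept $p_2$.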
In Matlab
% data points
x = 0:0.1:1.0;
y = 2 * x + 3;
% introduce some noise into the data
y_noise = y + 0.1*randn(size(x));
%see the data
figure;
plot(x,y_noise);
axis([0 1 0 5])
In Matlab, the polyfit function (see the Matlab docs) is used to fit a polynomial model of a given degree to the data.
- Inputs: x data, y data, polynomial degree
- Output: coefficients of the model, ordered from the highest power to the lowest
P = polyfit(x,y_noise,1) % linear model
hold on;
plot(x,polyval(P,x),'r');
In the example shown, the fitted coefficients in P come out close to, but not exactly equal to, the true values of 2 and 3, because of the noise introduced into the data.
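To make the fit more concrete, the sketch below compares polyfit with a direct least-squares solve using the backslash operator, and measures how far the data sits from the fitted line. It regenerates data of the same form as the example, so the exact numbers will differ on every run; the names `A`, `P_ls` and `rmse` are introduced here for illustration.

```matlab
% Sketch: compare polyfit with a direct least-squares solve (synthetic data).
x = 0:0.1:1.0;
y_noise = 2*x + 3 + 0.1*randn(size(x));   % same form as the example data

P = polyfit(x, y_noise, 1);               % [slope intercept]

A = [x' ones(numel(x), 1)];               % one column per unknown coefficient
P_ls = (A \ y_noise')';                   % least-squares solution of A*p = y

residuals = y_noise - polyval(P, x);      % what the fitted line leaves unexplained
rmse = sqrt(mean(residuals.^2));
fprintf('polyfit:   slope %.3f, intercept %.3f\n', P(1), P(2));
fprintf('backslash: slope %.3f, intercept %.3f\n', P_ls(1), P_ls(2));
fprintf('RMSE of fit: %.3f\n', rmse);
```

Both approaches minimise the sum of squared residuals, so the two sets of coefficients should agree to within rounding error.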
Limitations
- A model that is too complex can lead to overfitting, where the model fits the noise in the data as well as the underlying behaviour
- To overcome this:
- Use a simpler model
- Collect more data
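Overfitting can be illustrated with a small experiment, sketched here on synthetic data of the same form as the earlier example. The polynomial degrees, the noise level and the new test points are all assumptions chosen for illustration.

```matlab
% Overfitting sketch on synthetic data (all values are illustrative assumptions).
x_fit = 0:0.1:1.0;                                  % points used for fitting
y_fit = 2*x_fit + 3 + 0.1*randn(size(x_fit));
x_new = 0.05:0.1:0.95;                              % new points, not used for fitting
y_new = 2*x_new + 3 + 0.1*randn(size(x_new));

P_lo = polyfit(x_fit, y_fit, 1);                    % simple model
P_hi = polyfit(x_fit, y_fit, 8);                    % much more flexible model

rmse = @(y, y_hat) sqrt(mean((y - y_hat).^2));      % helper for root-mean-square error
fit_lo = rmse(y_fit, polyval(P_lo, x_fit));
fit_hi = rmse(y_fit, polyval(P_hi, x_fit));
new_lo = rmse(y_new, polyval(P_lo, x_new));
new_hi = rmse(y_new, polyval(P_hi, x_new));

fprintf('Degree 1: RMSE fit = %.3f, RMSE new = %.3f\n', fit_lo, new_lo);
fprintf('Degree 8: RMSE fit = %.3f, RMSE new = %.3f\n', fit_hi, new_hi);
```

The degree 8 model always matches the fitting data at least as closely as the straight line, but it typically does worse on the new points because part of what it has learned is noise.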