I have a dataset with 8 continuous variables describing the behavior of a machine:
- inputs a, b, c for both the left and right side: a_left, b_left, c_left, a_right, b_right, c_right
- output o, non-linearly related to the inputs (and almost certainly also influenced by other factors)
- timestamp t
I also have some environmental data (e.g. temperature), but these are fairly constant and I suspect they have no significant influence.
Theory:
My theory is that fouling degrades the machine's performance, so the value of o will be lower for the same input values.
Occasionally, the machine is cleaned to restore performance. I do not know when this happens or how well it works (i.e. whether it returns the machine to a "perfect" state or to some intermediate state between "perfect" and the previous state). The rate of degradation is also unknown and most likely not constant.
Furthermore, a change in one of the input variables has a delayed effect on the output due to inertia.
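To make the theory concrete, here is a minimal sketch of the data-generating process I have in mind, written in Python purely for illustration (I would still prefer Matlab for the actual analysis). Everything in it is an assumption rather than something I know about the machine: the function name `simulate`, the single input `x` standing in for a, b and c, the square-root input/output relation, the first-order lag used for inertia, the constant decay rate, and the fixed cleaning interval are all hypothetical.

```python
import math
import random


def simulate(n_steps=600, clean_every=200, decay=0.002, tau=5.0, seed=0):
    """Generate (t, x, o, fouling) samples under the assumed degradation model."""
    rng = random.Random(seed)
    fouling = 1.0      # 1.0 = perfectly clean; lower = more fouled (assumed scale)
    x_lagged = 0.0     # first-order lag of the input, standing in for inertia
    rows = []
    for t in range(n_steps):
        if t > 0 and t % clean_every == 0:
            # Cleaning restores only part of the lost performance (assumption).
            fouling += rng.uniform(0.5, 1.0) * (1.0 - fouling)
        x = rng.uniform(0.0, 1.0)            # one input standing in for a, b, c
        x_lagged += (x - x_lagged) / tau     # delayed response to input changes
        clean_output = math.sqrt(x_lagged)   # arbitrary non-linear input/output map
        o = fouling * clean_output + rng.gauss(0.0, 0.01)
        rows.append((t, x, o, fouling))
        fouling *= 1.0 - decay               # gradual degradation between cleanings
    return rows
```

In the real data only t, the inputs and o are observed; the `fouling` column is the hidden quantity I would like to recover, and neither the cleaning times nor the decay rate are known in advance.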
Actions I have tried:
- plotting the data for some different relations between the input and output
- k-means clustering for some different relations between input and output (without timestamps)
Questions:
- Is it at all possible to verify my theory with this data?
- If so, which techniques are best suited? I'd prefer anything in Matlab.
