3

I have a rectangular numeric dataset, and I'm applying a multilayer perceptron to it. I'm having success, but I'm now looking to see what other architectures I can apply.

Much of deep learning seems to be applied to loosely-structured data -- sequences, text, images -- and everyone is having a lot of fun working with a variety of interesting models...at least when they have a problem that fits these models.

What about basic row/column datasets? What are some of the canonical models used with this kind of data, apart from tweaking the layers of a basic MLP?

Monica Heddneck
  • 477
  • 2
  • 7
  • 14
  • Images are _strongly_ structured. Neural networks can deal with structured prediction just fine. What kind of structure does your data have? – Emre Aug 10 '17 at 16:37
  • 35 features...3 class output. Very basic...all data is numeric – Monica Heddneck Aug 10 '17 at 16:39
  • You don't make it sound very structured, so why can't you use an MLP? I assume you know that each "row" (datum) is fed separately? – Emre Aug 10 '17 at 16:41
  • I can use an MLP...what else can I also use? My options feel limited. – Monica Heddneck Aug 10 '17 at 16:42
  • I can't say without knowing anything about the data. Share and describe some sample data if you want more specific advice. Generically you can optimize the hyperparameters, namely the optimizer, activation functions, regularizer, and shape of the network. – Emre Aug 10 '17 at 16:46
  • Generally speaking, if I had a set of images, I could use a variety of models like VGG, ResNet, Tiramisu, etc. Since I have a flat rectangular dataset, it appears I can use...well, only an MLP. Yes, I can optimize hyperparameters, but what else is there beyond the MLP? This is my basic question. – Monica Heddneck Aug 10 '17 at 16:49
  • 1
    @MonicaHeddneck Actually, based on this discussion I suggest trying a linear SVM. A linear SVM works best for data that are linearly separable, so whenever I fit a model I first try linear models to see whether the data can be separated linearly in the input dimension or not (a minimal sketch of that check follows below). This may help you. – Green Falcon Aug 10 '17 at 17:12
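
A minimal sketch of that linear-separability check with scikit-learn; the data here is a synthetic stand-in for the 35-feature, 3-class dataset described in the question, and the hyperparameters are arbitrary:

```python
# Check whether a purely linear model already separates the classes well.
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for the real 35-feature, 3-class data.
X, y = make_classification(n_samples=500, n_features=35, n_classes=3,
                           n_informative=10, random_state=0)

# Scale the features, then fit a linear SVM.
clf = make_pipeline(StandardScaler(), LinearSVC(C=1.0, max_iter=5000))

# A high cross-validated accuracy suggests the classes are close to
# linearly separable in the input space.
print(cross_val_score(clf, X, y, cv=5).mean())
```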

2 Answers

3

Fancy deep learning architectures mostly work by exploiting structure in your data: permutation invariance or equivariance, temporal structure, and spatial structure come to mind. Maybe some of your features are based on a set of the same objects? Most of the benefit comes from sharing weights and learning a shared representation. Another potential benefit is flexibility in the output, although that is less relevant in your case. If you have a lot more labeled data on another task with the same features, you could pretrain a larger model on that task and then fine-tune that network on your current task. Beyond that, it is difficult to say without seeing some example data.
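
A rough sketch of that pretrain-then-fine-tune idea in Keras; the auxiliary dataset (X_pre, y_pre), its assumed 10 classes, and the layer sizes are all hypothetical placeholders, not something dictated by your problem:

```python
# Pretrain a shared body on a larger auxiliary task, then fine-tune on the
# 3-class target task that uses the same 35 features.
from tensorflow import keras
from tensorflow.keras import layers

n_features = 35

# Shared representation, learned first on the auxiliary task.
base = keras.Sequential([
    layers.Dense(128, activation="relu", input_shape=(n_features,)),
    layers.Dense(64, activation="relu"),
], name="shared_body")

# Pretraining head (auxiliary task assumed to have 10 classes).
pretrain_model = keras.Sequential([base, layers.Dense(10, activation="softmax")])
pretrain_model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
# pretrain_model.fit(X_pre, y_pre, epochs=20, batch_size=64)

# Fine-tuning head for the 3-class target task, reusing the pretrained body
# with a smaller learning rate.
finetune_model = keras.Sequential([base, layers.Dense(3, activation="softmax")])
finetune_model.compile(optimizer=keras.optimizers.Adam(1e-4),
                       loss="sparse_categorical_crossentropy",
                       metrics=["accuracy"])
# finetune_model.fit(X_train, y_train, epochs=10, batch_size=64)
```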

Jan van der Vegt
  • 9,328
  • 34
  • 52
1

Given 35 numeric features and 3 classes, perhaps first try a variant of SVM or a random forest to get a handle on the data. Neither of these is deep learning, but they are fast and good for benchmarking and troubleshooting.
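
For example, a quick benchmarking sketch with scikit-learn; the data below is a synthetic stand-in for the 35-feature, 3-class dataset and the hyperparameters are arbitrary:

```python
# Fast, non-deep baselines: a random forest and an RBF-kernel SVM.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for the real data.
X, y = make_classification(n_samples=500, n_features=35, n_classes=3,
                           n_informative=10, random_state=0)

models = {
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "rbf_svm": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
}

# 5-fold cross-validated accuracy as a quick benchmark.
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```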

If you've already done that, you could use a restricted Boltzmann machine to build a generative model of the data, or an unsupervised algorithm like an autoencoder or self-organising map to reduce its dimensionality, and then use that reduced representation as the input to your MLP/SVM/softmax classifier.
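
A minimal autoencoder sketch along those lines, assuming the 35 numeric features have been standardised; the bottleneck size of 8 is an arbitrary choice:

```python
# Compress the 35 features to an 8-dimensional code with an autoencoder,
# then feed the codes to a downstream classifier.
from tensorflow import keras
from tensorflow.keras import layers

inputs = keras.Input(shape=(35,))
encoded = layers.Dense(16, activation="relu")(inputs)
bottleneck = layers.Dense(8, activation="relu")(encoded)
decoded = layers.Dense(16, activation="relu")(bottleneck)
outputs = layers.Dense(35, activation="linear")(decoded)

autoencoder = keras.Model(inputs, outputs)  # trained to reconstruct the inputs
encoder = keras.Model(inputs, bottleneck)   # used afterwards for dimensionality reduction

autoencoder.compile(optimizer="adam", loss="mse")
# autoencoder.fit(X_train, X_train, epochs=50, batch_size=32)

# The 8-dimensional codes then replace the raw features as input to an
# MLP, SVM, or softmax classifier:
# X_train_compressed = encoder.predict(X_train)
```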

There are other options out there, but it might be worth finding out what sort of results you get from these faster and more interpretable algorithms before building a full CNN, etc.

redhqs
  • 1,638
  • 14
  • 19