
Pipeline Pilot


Categorical model using R Deep Neural Net


I generated several DNN models using the "Learn R Deep Neural Net Model" component, and they worked fine.

I am interested in generating some categorical DNN models with more than two categories (instead of active vs. inactive, as the example shows) and would very much appreciate your help. I did not find an option for the number of categories. Also, should I pre-define and separate each set myself, or is there a way to separate them automatically based on a certain property (e.g., IC50)?
Many thanks!


Craig (Accelrys)
An answer courtesy of our Machine Learning expert in development:

"Our DNN learner, like pretty much all our learners other than PCA, is a *supervised* method. So it won't tell you what the categories *should* be based on something like raw IC50 data. However, if you tell it what the categories are for the input training data, there's nothing special you need to do for the non-binary case. Just feed in the data as you would for a binary problem, being sure that TypeOfPropertyToLearn is set to Categorical.
See attached protocol in which the categories are 0, 1, 2, and 3. (Note that with unbalanced data, the model tends to always predict the majority class; the filter is designed to create a more balanced training set. But even so, the model predicts only class 0, 1, or 3 for the training data. But you do get probability predictions out of the model component if you want more nuanced results.)"

Please see the attached example protocol.
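
Since the learner is supervised, the categories have to be assigned before training. The sketch below is not taken from the attached protocol; it only illustrates the idea in plain Python. The IC50 cut-offs, the property names (IC50_nM, ActivityClass), and the downsampling approach are all assumptions made for illustration, so substitute whatever thresholds and balancing filter suit your data.

```python
import random
from collections import Counter, defaultdict

# Hypothetical IC50 cut-offs in nM; pick values that make sense for your assay.
THRESHOLDS_NM = [100.0, 1000.0, 10000.0]


def ic50_to_class(ic50_nm, thresholds=THRESHOLDS_NM):
    """Return 0 for the most potent bin, up to len(thresholds) for the weakest."""
    for label, cutoff in enumerate(thresholds):
        if ic50_nm < cutoff:
            return label
    return len(thresholds)


def downsample(records, label_key, seed=0):
    """Randomly trim every class to the size of the smallest class."""
    by_class = defaultdict(list)
    for rec in records:
        by_class[rec[label_key]].append(rec)
    smallest = min(len(members) for members in by_class.values())
    rng = random.Random(seed)
    balanced = []
    for label in sorted(by_class):
        balanced.extend(rng.sample(by_class[label], smallest))
    return balanced


# Made-up IC50 values, only to exercise the functions.
ic50_values = [5, 50, 80, 250, 400, 900, 950, 3000, 20000, 45000, 60000, 75000]
records = [{"IC50_nM": v, "ActivityClass": ic50_to_class(v)} for v in ic50_values]
balanced = downsample(records, "ActivityClass")

print("before:", Counter(r["ActivityClass"] for r in records))
print("after: ", Counter(r["ActivityClass"] for r in balanced))
```

The resulting ActivityClass property (values 0 to 3 here) is what you would then learn with TypeOfPropertyToLearn set to Categorical. Downsampling throws data away, so with a very small minority class you may prefer a milder filter.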

Craig - you might want to have Dana mention in the NN learner component that the method supports multicategory learning. This is very useful, and was previously only available in the Bayesian learner.
Just to clarify, all of the following learners support multi-category (i.e., non-binary) classification models:

Learn RP Tree Model
Learn RP Forest Model
Learn R Forest Model
Learn R Deep Neural Net Model
Learn R Linear Discriminant Model
Learn R Logistic Regression Model
Learn R Mixture Discriminant Model
Learn R Neural Net Model
Learn R Support Vector Machine Model
Learn R XGBoost Model

In the Bayesian case (Learn Categories and Learn Molecular Categories), the multi-category situation is treated internally as a collection of "1-versus-rest" binary models.
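
For anyone unfamiliar with the "1-versus-rest" idea, here is a minimal sketch in plain Python. It is not the Bayesian learner's actual code; the function names are made up, the scores are invented numbers standing in for the output of the per-category binary models, and picking the highest score is just one common way to combine them (an assumption, not something stated above).

```python
def one_vs_rest_labels(labels):
    """Expand one multi-category label list into a binary label list per category."""
    categories = sorted(set(labels))
    return {c: [1 if y == c else 0 for y in labels] for c in categories}


def predict_category(scores_by_category):
    """Return the category whose binary model produced the highest score."""
    return max(scores_by_category, key=scores_by_category.get)


labels = [0, 1, 2, 3, 1, 0]
binary = one_vs_rest_labels(labels)
print(binary[1])  # [0, 1, 0, 0, 1, 0]: "is class 1" versus "everything else"

# Invented scores, one per binary model, for a single test record.
scores = {0: -1.2, 1: 0.7, 2: -0.3, 3: 0.1}
print(predict_category(scores))  # 1
```

Each binary label list would be used to train one binary model, so four categories means four models.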
(How did I not know that?)

Dana - in that case you should make that clear in all of the help descriptions for those learners...

Do all of them support multiple class arrays in the learned property?
Among the multi-category learners, the Bayesian one is the outlier in that it models multiple independent binary categories. For any given item, you specify the categories the item belongs to in an array. But you can also model the case described below by having the array be single-valued. (E.g., see the Learn Iris Types example protocol.)

The other learners by contrast model a single categorical response property which may have multiple mutually exclusive values -- e.g., "mouse", "rat", "dog", "human" as values for a Species property. (Since multi-class support is the rule rather than the exception, it didn't occur to me to highlight it. But clearly we should be more explicit in our documentation.)
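
To make the difference in data layout concrete, here is a small sketch in plain Python. The records, the property names (Categories, Species), and the helper functions are hypothetical and only mirror the description above; they are not taken from any attached protocol.

```python
# Bayesian-style records: an array of categories per item.
# Zero, one, or several categories may apply to the same item.
multi_label_records = [
    {"Name": "item-1", "Categories": ["A", "B"]},
    {"Name": "item-2", "Categories": ["B"]},
    {"Name": "item-3", "Categories": []},
]

# Records for the other learners: one property with mutually exclusive values.
single_label_records = [
    {"Name": "sample-1", "Species": "mouse"},
    {"Name": "sample-2", "Species": "rat"},
    {"Name": "sample-3", "Species": "dog"},
]


def to_indicators(records, key):
    """Turn category arrays into independent 0/1 flags, one flag per category."""
    vocab = sorted({c for rec in records for c in rec[key]})
    return [{c: int(c in rec[key]) for c in vocab} for rec in records]


def as_single_valued_arrays(records, key):
    """Express a mutually exclusive property as a one-element category array."""
    return [{"Name": rec["Name"], "Categories": [rec[key]]} for rec in records]


print(to_indicators(multi_label_records, "Categories"))
print(as_single_valued_arrays(single_label_records, "Species"))
```

The second helper shows the single-valued-array case: a mutually exclusive property can always be written as an array of length one, but not the other way around.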
Jason DeJoannis
Attached is a pedagogical example of the Bayesian outlier. A typical record looks like this: