machine learning - Unable to apply learned model to test data in R


I am using the Titanic dataset from Kaggle and want to learn a simple logistic regression model.

I read in the train and test data, and train$survived, train$sex, test$survived and test$sex are all factors.

I perform a simple logistic regression with sex as the independent variable.

fit <- glm(formula = survived ~ sex, family = binomial) 

It seems okay to me:

> fit

Call:  glm(formula = survived ~ sex, family = binomial)

Coefficients:
(Intercept)      sexmale
      1.057       -2.514

Degrees of Freedom: 890 Total (i.e. Null);  889 Residual
Null Deviance:      1187
Residual Deviance: 917.8    AIC: 921.8

The problem is that I am unable to apply the learned model to the test data. When I do the following:

predict(fit, train$sex) 

I get a vector of 891 values, which is the number of training examples in the training set.

I can't seem to find any information on how to do this right.

Any help is appreciated!

I'm posting this answer to correct a couple of points that seem to have gotten confused. There is no predict function as such. What is meant is that the help page says "predict" is a "generic function". Generic functions often have a fun.default method, but in the case of predict.*, there is no default method. Dispatch is on the basis of the class of the first argument. There are separate help pages for each method, and the help page for "predict" lists several. Package authors need to write their own predict methods for new classes.
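To see that dispatch in a session, here is a minimal sketch. The data frame `toy` and its values are made up for illustration; they are not the Kaggle Titanic data.

```r
# Hypothetical toy data, invented for illustration (not the Titanic data).
toy <- data.frame(
  survived = factor(c(1, 1, 1, 0, 0, 0, 0, 1)),
  sex      = factor(c("female", "female", "female", "female",
                      "male", "male", "male", "male"))
)
fit <- glm(survived ~ sex, data = toy, family = binomial)

# A glm fit carries the classes "glm" and "lm", in that order.
class(fit)

# predict() dispatches on the first class, so calling the generic
# gives the same result as calling predict.glm() directly.
identical(predict(fit), predict.glm(fit))

# List the predict methods registered in the current session.
methods(predict)
```

The output of `methods(predict)` depends on which packages are loaded, so the list will differ from one session to another.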

Logistic regression predates the machine learning paradigm, so expecting it to "predict classes" is unrealistic. The fact that you can get a "response" prediction at all is a gift over what software would have provided 30 years ago, when some of us were taking our regression classes. One needs to understand that probabilities are not 0 or 1 but rather something in between. If the user wants to set a threshold and determine how many cases exceed that threshold, that is an analyst decision, and analysts need to make the transformations to categories that they deem worthwhile.
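As a rough sketch of that analyst step: ask for probabilities with type = "response" and then apply a cutoff of your own choosing. The toy data and the 0.5 threshold below are assumptions made for illustration. The last two lines convert the coefficients printed in the question into probabilities with plogis().

```r
# Hypothetical toy data, invented for illustration (not the Titanic data).
toy <- data.frame(
  survived = factor(c(1, 1, 1, 0, 0, 0, 0, 1)),
  sex      = factor(c("female", "female", "female", "female",
                      "male", "male", "male", "male"))
)
fit <- glm(survived ~ sex, data = toy, family = binomial)

# One new case per level of sex, using the same factor levels as the fit.
new_cases <- data.frame(sex = factor(c("female", "male"),
                                     levels = levels(toy$sex)))

# type = "response" gives probabilities; the default ("link") gives log-odds.
p <- predict(fit, newdata = new_cases, type = "response")

# The cutoff is the analyst's choice; 0.5 is only an assumed example.
threshold <- 0.5
predicted_class <- ifelse(p > threshold, 1, 0)

# Coefficients from the question's output, turned into probabilities.
p_female <- plogis(1.057)            # about 0.742
p_male   <- plogis(1.057 - 2.514)    # about 0.189
```

With a single two-level predictor the fitted probabilities are just the observed proportions in each group, which is why the toy fit returns 0.75 and 0.25.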

Executing predict(fit, train$sex) would be expected to give a result as long as there were values in the training set, so I'm guessing that perhaps you meant to try predict(fit, test$sex) and were disappointed. If that's the case, it should have been: predict(fit, list(sex=test$sex)). R needs the argument to be a value that can be coerced to a data frame, so a named list of values is the minimum requirement for the predictors.
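A small sketch of the two forms that should work, a named list and a data frame. The `train` and `test` objects here are made-up stand-ins for the Kaggle files, not the real data.

```r
# Hypothetical stand-ins for the Kaggle train and test sets.
train <- data.frame(
  survived = factor(c(1, 1, 1, 0, 0, 0, 0, 1)),
  sex      = factor(c("female", "female", "female", "female",
                      "male", "male", "male", "male"))
)
test <- data.frame(
  sex = factor(c("male", "female", "female"), levels = levels(train$sex))
)

fit <- glm(survived ~ sex, data = train, family = binomial)

# newdata as a named list: the name must match the variable in the formula.
p_list <- predict(fit, newdata = list(sex = test$sex), type = "response")

# newdata as a data frame with a column of the same name.
p_df <- predict(fit, newdata = test, type = "response")

# One prediction per row of the test set, not per row of the training set.
length(p_df) == nrow(test)
```

Passing data = train to glm(), rather than relying on variables found on the search path, keeps the fit and the later predict() call tied to explicitly named columns.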

If predict.glm gets a malformed value for its second argument, newdata, it falls back on the original data and uses the linear predictors that were retained in the model object.

