python - statsmodels.api.Logit: ValueError: array must not contain infs or NaNs
I am trying to apply logistic regression in Python using statsmodels.api.Logit. I am running into the error ValueError: array must not contain infs or NaNs.
This happens when executing:
import statsmodels.api as sm

data['intercept'] = 1.0
train_cols = data.columns[1:]
logit = sm.Logit(data['admit'], data[train_cols])
result = logit.fit(start_params=None, method='bfgs', maxiter=20, full_output=1, disp=1, callback=None)
The data contains more than 15000 columns and 2000 rows. data['admit'] is the target value and data[train_cols] is the list of features. Can anyone please give me some hints to fix this problem?
By default, Logit does not check your data for un-processable infinities (np.inf) or NaNs (np.nan). In pandas, the latter usually signifies a missing entry.
To ignore rows with missing data and proceed with the rest, use missing='drop', like so:
sm.Logit(data['admit'], data[train_cols], missing='drop')
See the Logit docs for other options.
If you do not expect your data to contain missing entries or infinities, perhaps it was loaded incorrectly. Look at data[data.isnull()] to see where the problem is. (N.B. read this to see how to make infs register as null.)
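With 15000 columns it may be easier to list the offending rows and columns than to eyeball the whole frame. The following is only a sketch on a toy frame. The gre and gpa columns and their values are hypothetical, and replacing inf with NaN is one possible way to make infinities register as null.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame standing in for the real data.
data = pd.DataFrame({
    "admit": [0, 1, 0, 1, 1],
    "gre": [380.0, 660.0, np.nan, 640.0, 520.0],
    "gpa": [3.61, np.inf, 4.00, 3.19, 2.93],
})

# Treat +/-inf as missing so that isnull() reports them too.
clean = data.replace([np.inf, -np.inf], np.nan)

# Names of columns and index labels of rows with at least one missing value.
bad_cols = clean.columns[clean.isnull().any(axis=0)].tolist()
bad_rows = clean.index[clean.isnull().any(axis=1)].tolist()

# Number of missing values per column.
n_missing = clean.isnull().sum()

print(bad_cols)   # ['gre', 'gpa']
print(bad_rows)   # [1, 2]
```

If the lists are short, inspecting clean.loc[bad_rows, bad_cols] should show whether the values are genuinely missing or were mangled while loading.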