python - statsmodels.api.Logit: ValueError: array must not contain infs or NaNs -


I am trying to apply logistic regression in Python using statsmodels.api.Logit, and I am running into the error ValueError: array must not contain infs or NaNs.

It happens when executing:

    import statsmodels.api as sm

    data['intercept'] = 1.0
    train_cols = data.columns[1:]
    logit = sm.Logit(data['admit'], data[train_cols])
    result = logit.fit(start_params=None, method='bfgs', maxiter=20, full_output=1, disp=1, callback=None)

The data contains more than 15000 columns and 2000 rows. data['admit'] is the target value and data[train_cols] is the list of features. Can anyone please give me some hints on how to fix this problem?

By default, Logit does not check your data for un-processable infinities (np.inf) or NaNs (np.nan). In pandas, the latter usually signifies a missing entry.

To ignore rows with missing data and proceed with the rest, use missing='drop' like so:

    sm.Logit(data['admit'], data[train_cols], missing='drop')

See the Logit docs for other options.

If you did not expect your data to contain missing entries or infinities, perhaps it was loaded incorrectly. Look at data[data.isnull()] to see where the problem is. (N.B. read this to see how to make infs register as null.)
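
With thousands of columns it can be easier to list only the offending columns and rows. The following is a small sketch under assumptions: the toy frame and its column names are invented, and replacing infinities with NaN via DataFrame.replace is one possible way to make infs count as missing, not necessarily the approach the note above refers to.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame standing in for the real data
data = pd.DataFrame({
    "admit": [0, 1, 0, 1],
    "gre": [380.0, np.nan, 800.0, 640.0],
    "gpa": [3.61, 3.67, np.inf, 3.19],
})

# Turn +/-inf into NaN so that isnull() reports them as missing too
cleaned = data.replace([np.inf, -np.inf], np.nan)

# Names of the columns holding at least one missing value
bad_cols = cleaned.columns[cleaned.isnull().any()].tolist()

# The rows holding at least one missing value
bad_rows = cleaned[cleaned.isnull().any(axis=1)]

print(bad_cols)
print(bad_rows)
```

If the number of bad rows or columns is much larger than expected, that points to a loading problem rather than genuinely missing data.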

