R: change one column of a data.frame to a binary vector
This question already has an answer here:
- Dummy variables from a string variable (6 answers)
I have read a file into a data.frame in R, and I can see that the 5th column contains values separated by ";". Is it possible to turn this data.frame into a larger data.frame and expand the 5th column into a binary vector?
    > head(uinfo)
          v1   v2 v3  v4                             v5
    1 100044 1899  1   5    831;55;198;8;450;7;39;5;111
    2 100054 1987  2   6                              0
    3 100065 1989  1  57                              0
    4 100080 1986  1  31 113;41;44;48;91;96;42;79;92;35
    5 100086 1986  1 129                              0
    6 100097 1981  1  75                              0
So, as a simpler example, if the first 2 rows are:
    1 100044 1899 1 5 1;2;4;7
    2 100054 1987 2 6 3;8
I want to get:
    1 100044 1899 1 5 1 1 0 1 0 0 1 0 0 0
    2 100054 1987 2 6 0 0 1 0 0 0 0 1 0 0
Do I have to use another program such as Python to preprocess the data, or is it possible to do this with some apply function?
Thanks.
You can try the concat.split.expanded function from the "splitstackshape" package:
    library(splitstackshape)
    mydf
    #       v1   v2 v3 v4      v5
    # 1 100044 1899  1  5 1;2;4;7
    # 2 100054 1987  2  6     3;8
    concat.split.expanded(mydf, "v5", sep = ";", fill = 0)
    #       v1   v2 v3 v4      v5 v5_1 v5_2 v5_3 v5_4 v5_5 v5_6 v5_7 v5_8
    # 1 100044 1899  1  5 1;2;4;7    1    1    0    1    0    0    1    0
    # 2 100054 1987  2  6     3;8    0    0    1    0    0    0    0    1
Add drop = TRUE to get rid of the original column.
Here, "mydf" is defined as:
    mydf <- structure(list(v1 = c(100044L, 100054L), v2 = c(1899L, 1987L),
        v3 = 1:2, v4 = 5:6, v5 = c("1;2;4;7", "3;8")),
        .Names = c("v1", "v2", "v3", "v4", "v5"),
        class = "data.frame", row.names = c(NA, -2L))
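If you would rather not add a package, the same idea can be sketched in base R. This is only a minimal sketch, not what splitstackshape does internally: the helper name expand_binary and its n_cols argument are made up for illustration, and it assumes the values in the column are positive integers that index the position of the 1s (a lone "0", as in the question's data, simply yields an all-zero row).

```r
# Example data from the question
mydf <- data.frame(
  v1 = c(100044L, 100054L),
  v2 = c(1899L, 1987L),
  v3 = 1:2,
  v4 = 5:6,
  v5 = c("1;2;4;7", "3;8"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: expand a delimited column into 0/1 indicator columns.
# n_cols fixes the width of the indicator block; if NULL, the largest value
# found in the column is used.
expand_binary <- function(df, col, sep = ";", n_cols = NULL) {
  # Split each string on the separator and convert the pieces to integers
  vals <- lapply(strsplit(as.character(df[[col]]), sep, fixed = TRUE),
                 as.integer)
  if (is.null(n_cols)) n_cols <- max(unlist(vals), 1L, na.rm = TRUE)

  # Start with all zeros, then set a 1 at each listed position
  binary <- matrix(0L, nrow = nrow(df), ncol = n_cols)
  for (i in seq_along(vals)) {
    v <- vals[[i]]
    v <- v[!is.na(v) & v >= 1 & v <= n_cols]  # ignore 0 and out-of-range values
    binary[i, v] <- 1L
  }
  colnames(binary) <- paste0(col, "_", seq_len(n_cols))

  # Drop the original column and append the indicator columns
  cbind(df[setdiff(names(df), col)], binary)
}

result <- expand_binary(mydf, "v5", n_cols = 10)
print(result)
```

Passing n_cols = 10 gives the ten indicator columns asked for in the question; leaving it out sizes the block to the largest value present, which for this example gives eight columns like the package output above. For real data with values such as 831, the block gets correspondingly wide, so a sparse representation may be worth considering.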