machine learning - R Random Forest Unsupervised


I'm trying to understand how to implement random forest in unsupervised mode to detect outliers.

Here is the dataset I am using:

dataset: https://gist.github.com/k2xl/5cd9a048ae153275f9c7

If you look through it, there is one row with these values:

xktveqax    570 12980.5 clothing store 

The amount is far larger than the other values, so I am expecting this row to be detected in the random forest output.

library(randomForest)
library(ggplot2)

data_set <- read.csv("~/path/anomaly-sample.csv", header=TRUE, as.is=TRUE)
data_set$category = factor(data_set$category)
train_all = data_set
test_all = train_all
#test_all = data_set[0:200,]

# no response is passed, so randomForest runs in unsupervised mode;
# the first column is dropped before fitting
rf <- randomForest(train_all[,-1], importance=TRUE, mtry=3, norm.votes=FALSE)
print(rf)
predictions <- rf$votes
qplot(test_all$mins.after.midnight, test_all$amount, size=predictions[,2])
results <- cbind(test_all, predictions)
results <- results[sort.list(results[,5]), ]

What I am trying to do is graph the outliers as big circles to demonstrate their unusualness. Am I doing this right?
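
One alternative to sizing the circles by rf$votes is the proximity-based outlier score that the randomForest package provides through outlier(). The sketch below is only an illustration under stated assumptions: the data frame toy is a made-up stand-in for the CSV (its column names and category levels are hypothetical), and whether the extreme row actually ranks first will depend on the data and the random seed.

```r
library(randomForest)

set.seed(42)

# Hypothetical stand-in for the real CSV: 200 ordinary rows plus one
# row with a very large amount (values invented for illustration).
n <- 200
toy <- data.frame(
  mins.after.midnight = c(sample(0:1439, n, replace = TRUE), 570),
  amount = c(round(runif(n, 5, 300), 2), 12980.5),
  category = factor(c(
    sample(c("clothing store", "grocery", "restaurant"), n, replace = TRUE),
    "clothing store"
  ))
)

# No response is supplied, so the forest is grown in unsupervised mode.
# proximity = TRUE keeps the row-by-row proximity matrix.
urf <- randomForest(toy, ntree = 500, proximity = TRUE)

# outlier() converts the proximity matrix into one score per row;
# a larger score means the row is close to fewer other rows.
toy$outlier.score <- outlier(urf)

# Show the rows with the largest scores first.
head(toy[order(toy$outlier.score, decreasing = TRUE), ])
```

A score computed this way could then be passed as the size argument in the qplot call in place of predictions[,2].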

