unix - CSV - remove rows in which any column is empty -


I'm playing with the Titanic data set from Kaggle. I'd like to remove the rows of train.csv that have an empty column (I know this isn't the best way to deal with missing data, but the question is interesting to me regardless).

I'd like to do this in a Unix-type way (using awk, sed, or grep), because I'm trying to get better at those tools, but I'm not sure where to start.

Example of the data:

passengerid,survived,pclass,name,sex,age,sibsp,parch,ticket,fare,cabin,embarked
1,0,3,"braund, mr. owen harris",male,22,1,0,a/5 21171,7.25,,s
2,1,1,"cumings, mrs. john bradley (florence briggs thayer)",female,38,1,0,pc 17599,71.2833,c85,c
3,1,3,"heikkinen, miss. laina",female,26,0,0,ston/o2. 3101282,7.925,,s

In the second row (counting the header), cabin is empty, so I want to remove that row from the file.

Note that the fourth column contains commas, so that column is enclosed in double quotes.

Aside:

I'd also like to know how to do this for specific columns, but I can ask a separate question if the answer to this question doesn't lead me to the answer to that one.

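I would stick with a language that has a CSV parser, because commas inside double quotes can be problematic. It is also easier to extend to compare specific columns. Here is an example in Python. It takes the number of fields from the header and compares it with the number of non-empty fields in each line to decide whether to print the line or not: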

import sys
import csv

with open(sys.argv[1], 'r', newline='') as csvfile:
    csvreader = csv.reader(csvfile)
    csvwriter = csv.writer(sys.stdout)
    # The header row defines how many fields each record should have.
    row = next(csvreader)
    fields = len(row)
    csvwriter.writerow(row)
    for row in csvreader:
        # Count the fields that are non-empty after stripping whitespace.
        l = len(list(filter(str.strip, row)))
        if l < fields:
            continue
        csvwriter.writerow(row)

Assuming the code is inside a file named script.py, run it like:

python script.py infile 

That yields:

passengerid,survived,pclass,name,sex,age,sibsp,parch,ticket,fare,cabin,embarked
2,1,1,"cumings, mrs. john bradley (florence briggs thayer)",female,38,1,0,pc 17599,71.2833,c85,c
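
For the aside about specific columns, the same approach can probably be extended along these lines. This is only a minimal sketch, not a definitive implementation: the function name `drop_rows_with_empty` and its `columns` parameter are hypothetical, the sample rows are shortened versions of the example data, and it assumes the first row is a header whose names identify the columns to check.

```python
import csv
import io


def drop_rows_with_empty(infile, outfile, columns=None):
    """Copy CSV rows from infile to outfile, skipping rows with empty fields.

    `columns` is a hypothetical parameter: a list of header names to check.
    When it is None, every column is checked.
    """
    reader = csv.reader(infile)
    writer = csv.writer(outfile)
    header = next(reader)
    writer.writerow(header)
    # Translate header names into positions; check all positions by default.
    if columns is None:
        indexes = range(len(header))
    else:
        indexes = [header.index(name) for name in columns]
    for row in reader:
        # A short row counts as having an empty field at the missing position.
        if any(i >= len(row) or not row[i].strip() for i in indexes):
            continue
        writer.writerow(row)


if __name__ == "__main__":
    # Shortened, illustrative rows based on the example data above.
    sample = (
        "passengerid,name,age,cabin\n"
        '1,"braund, mr. owen harris",22,\n'
        '2,"cumings, mrs. john bradley",38,c85\n'
    )
    out = io.StringIO()
    drop_rows_with_empty(io.StringIO(sample, newline=""), out, columns=["age"])
    print(out.getvalue(), end="")
```

With `columns=["age"]` both sample rows are kept, because only the age field is checked. With `columns=None` the first data row is dropped, because its cabin field is empty.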
