Python pandas: change one value based on another value
I'm trying to reprogram my Stata code in Python for speed improvements, and I was pointed in the direction of pandas. I am, however, having a hard time wrapping my head around how to process the data.
Let's say I want to iterate over the values in the column headed 'id'. If an id matches a specific number, I want to change the two corresponding values, firstname and lastname.
In Stata it looks like this:

    replace firstname = "matt" if id==103
    replace lastname = "jones" if id==103

So this replaces all values in firstname that correspond to values of id == 103 with matt.
In pandas, I'm trying something like this:

    df = read_csv("test.csv")
    for i in df['id']:
        if i == 103:
            ...

I'm not sure where to go from here. Any ideas?
One option is to use Python's slicing and indexing features to logically evaluate the places where your condition holds and overwrite the data there.
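To make "logically evaluate the places where your condition holds" concrete, here is a minimal sketch. The sample rows are hypothetical stand-ins for whatever test.csv contains; only the column names come from the question.

```python
import pandas as pd

# Hypothetical sample data standing in for test.csv
df = pd.DataFrame({
    "id": [101, 102, 103],
    "firstname": ["ann", "bob", "mat"],
    "lastname": ["lee", "ray", "jone"],
})

# Comparing a column to a value yields one True/False per row
mask = df["id"] == 103
print(mask.tolist())  # [False, False, True]

# Indexing with the mask keeps only the rows where it is True
print(df[mask])
```

The same mask can then be used on the left-hand side of an assignment, so no explicit loop over the rows is needed.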
Assuming you can load your data directly into pandas with pandas.read_csv, the following code might be helpful for you.

    import pandas

    df = pandas.read_csv("test.csv")
    # Select the rows where id is 103, then overwrite one column at a time
    df.loc[df.id == 103, 'firstname'] = "matt"
    df.loc[df.id == 103, 'lastname'] = "jones"

As mentioned in the comments, you can also do the assignment to both columns in one shot:

    df.loc[df.id == 103, ['firstname', 'lastname']] = 'matt', 'jones'

Note that you'll need pandas version 0.11 or newer to make use of loc for overwrite assignment operations.
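To try the loc approach without a file on disk, here is a self-contained sketch. The CSV text is made up for illustration and is fed to read_csv through io.StringIO. The mask is computed once and reused, which is a small convenience rather than a requirement.

```python
import io

import pandas as pd

# Hypothetical contents of test.csv, including two rows with id 103
csv_text = """id,firstname,lastname
101,ann,lee
103,mat,jone
104,bob,ray
103,matthew,j
"""
df = pd.read_csv(io.StringIO(csv_text))

# Build the condition once, then overwrite both columns on matching rows
mask = df["id"] == 103
df.loc[mask, ["firstname", "lastname"]] = ["matt", "jones"]

print(df)
```

Every row with id 103 is updated, as with the Stata replace ... if command, and the other rows are left unchanged.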
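Another way to do it is to use what is called chained assignment. Its behavior is less stable: pandas may warn about it, and depending on the version and settings it can end up modifying a temporary copy instead of df. It is therefore not considered the best solution, but it is useful to know about: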

    import pandas

    df = pandas.read_csv("test.csv")
    df['firstname'][df.id == 103] = "matt"
    df['lastname'][df.id == 103] = "jones"
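If the aim is to work column by column without chained indexing, one alternative (not part of the original answer) is Series.mask, which returns a new Series with values replaced wherever the condition is True. The sketch below uses hypothetical sample rows and assigns each result back to its column as a whole.

```python
import pandas as pd

# Hypothetical sample data standing in for test.csv
df = pd.DataFrame({
    "id": [101, 103, 104],
    "firstname": ["ann", "mat", "bob"],
    "lastname": ["lee", "jone", "ray"],
})

is_103 = df["id"] == 103

# mask() replaces values where the condition is True and keeps the rest
df["firstname"] = df["firstname"].mask(is_103, "matt")
df["lastname"] = df["lastname"].mask(is_103, "jones")

print(df)
```

Because each column is replaced in a single assignment on df itself, there is no intermediate object whose copy-or-view status matters.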