Apache Pig - Change the order of tuple fields


As an example, let's load two different files in a Pig script:

a = load 'file1' using PigStorage('\t') as (
    day:chararray,
    month:chararray,
    year:chararray,
    message:chararray);

b = load 'file2' using PigStorage('\t') as (
    month:chararray,
    day:chararray,
    year:chararray,
    message:chararray);

Now, notice that the order of the fields is different. If I combine them into one relation with c = union a, b; I get...

(2,oct,2013,info invalid username)
(oct,3,2013,warn stack overflow)

If for no other reason than to make the data easier to read, I'd like to reorder the fields so that both relations follow a common format and have the same positional notation for each field.

(2,oct,2013,info invalid username)
(3,oct,2013,warn stack overflow)

This crops up in a few other places with messages, levels, hosts, etc. It's not just the date fields that I'd like to make "prettier" all around.

In weird pseudo-code, I'd be looking for something like:

d = foreach b reorder (month,day,year) to (day,month,year);

I haven't been able to find an example of anyone trying to do this, and I don't see a function for it. Maybe it's just not possible and I'm alone here, but if anyone has ideas I'd appreciate some hints.

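In general, this is not necessary in Pig, because you can refer to fields by name and not worry about their position in the record. If your goal is the union of the two relations, you can achieve it using the onschema keyword, which matches fields by name rather than by position: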

c = union onschema a, b; 

That said, if you do need to reorder a relation, a simple foreach...generate is all you need:

d = foreach b generate day, month, year, message; 
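Putting the pieces together, here is a minimal sketch of the reorder-then-union approach. It assumes both files are tab-delimited and carry the schemas declared in the question; the alias b_reordered is made up for illustration.

```pig
-- File names and schemas are taken from the question.
a = load 'file1' using PigStorage('\t') as (day:chararray, month:chararray, year:chararray, message:chararray);
b = load 'file2' using PigStorage('\t') as (month:chararray, day:chararray, year:chararray, message:chararray);

-- Project b's fields by name, in the same order that a uses.
b_reordered = foreach b generate day, month, year, message;

-- A plain positional union is now safe, because both inputs share one column order.
c = union a, b_reordered;
describe c;
```

Because the projection refers to fields by name, the same pattern should carry over to the other cases mentioned in the question (levels, hosts and so on), as long as each relation is loaded with a schema.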

Note that in your example you are not working with tuples; you are working with entire records. If you did have a tuple, though, you can use the TOTUPLE built-in UDF to get where you need to go:

describe e;
e: {t: (month: chararray,day: chararray,year: chararray,message: chararray)}

f = foreach e generate TOTUPLE(t.day, t.month, t.year, t.message) as t;
describe f;
f: {t: (day: chararray,month: chararray,year: chararray,message: chararray)}
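If you want to try the tuple case yourself, one way to get a relation shaped like e is to wrap the fields of b with TOTUPLE first. This is only a sketch: it assumes b is loaded as in the question, and e and f are just illustrative aliases.

```pig
-- Assumes the tab-delimited 'file2' and schema from the question.
b = load 'file2' using PigStorage('\t') as (month:chararray, day:chararray, year:chararray, message:chararray);

-- Wrap each record's fields in a single tuple named t.
e = foreach b generate TOTUPLE(month, day, year, message) as t;

-- Rebuild the tuple with its fields projected in the desired order.
f = foreach e generate TOTUPLE(t.day, t.month, t.year, t.message) as t;
describe f;
```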
