python - SQL query for merging timestamps -


i have data in postgres table in following form:

col1   col2    col3              col4 id1          b                 c  id2    id1     timebegin         1###-##-## id2    id1     timeend           22##-##-## id3    id4     id5               id6 id6    id3     timebegin         2##-##-## id7    id3     timeend           200-3-##  id13   id8     id14              id15 id8    id9     timebegin         -2-1-1 id10   id11    id12              id13 

here 1###-##-## imply uncertainty in time (1000-01-01 1999-12-31)

and 22##-##-## imply uncertainty in time (2200-01-01 2200-12-31)

and 2##-##-## imply uncertainty in time (200-01-01 200-12-31)

and 200-3-## imply uncertainity in time (200-3-01 200-3-31)

and 20-3-## imply uncertainty in time (20-3-01 20-3-31)

and 200-3-## imply uncertainty in time (200-3-01 200-3-31)

and -200-3-## imply uncertainty in time (-200-3-31 -200-3-01)

now want merge 3 rows col1==col2 1 of following form:

col1   col2    col3              col4       timebegin      timeend id1          b                 c          1000-01-01     2200-12-31  id3    id4     id5               id6        200-01-02      200-3-31 id10   id11    id12              id13       null           null id13   id8     id14              id15       2-1-1 bc       9999-12-12 

if timeend col1==col2 not given 9999-12-12 presumed timeend

if timebegin col1==col2 not given 01-01-01 presumed timebegin

i.e. want take minimum of timebegin , maximum of timeend while merging.

is possible join operation in postgres. i.e. can write sql join query?

if can achieve desired using programming language python (in efficient manner) also: great.

i'm not convinced best way, here pair of postgres functions difficult bit of converting patterns min , max dates:

example fiddle

create function preprocesspattern(pat varchar(11), out cpat varchar(10), out neg boolean) $$ declare   y varchar(4);   m varchar(2);   d varchar(2);   int; begin   neg = false;   if left(pat, 1) = '-'     neg = true;     pat = right(pat, -1);   end if;    = position('-' in pat);    y = right('000' || left(pat, - 1), 4);   pat = right(pat, -i);   = position('-' in pat);   m = right('0' || left(pat, - 1), 2);   pat = right(pat, -i);   d = right('0' || pat, 2);   cpat = y || '-' || m || '-' || d; end; $$ language plpgsql;  create function datefromfmt(fmt varchar(10), neg boolean) returns date $$ begin   if neg     return to_date(fmt || ' bc', 'yyyy-mm-dd bc');   else     return to_date(fmt, 'yyyy-mm-dd');   end if; end; $$ language plpgsql;  create function mindatefrompattern(pat varchar(11)) returns date $$ declare     int;     neg boolean;     n varchar(10); begin     select * pat, neg preprocesspattern(pat);     = position('#' in pat);     if = 0       return datefromfmt(pat, neg);     else       n = left(pat, - 1) || right('0000-00-00', 0 - position('#' in pat) + 1);       n = replace(n, '-00', '-01');       return datefromfmt(n, neg);     end if; end; $$ language plpgsql;   create function maxdatefrompattern(pat varchar(11)) returns date $$ declare     int;     y int;     m int;     d int;     x varchar(10);     neg boolean;     res date; begin     select * pat, neg preprocesspattern(pat);     = position('#' in pat);     if = 0         return datefromfmt(pat, neg);     elsif = 1         return date '9999-12-31';         -- here down, pick next highest mask, convert min date subtract 1 day     elsif <= 6 -- add 1 year         if = 6 = 5; end if; -- skip - char         x = cast(cast(left(pat, - 1) int) + 1 varchar) || right(pat, 0 - + 1);     else       y = cast(left(pat, 4) int);       if = 7           m = cast(substr(pat, 6, 1) int) + 1;           if m = 2               m = 0;               y = y + 1;           end if;           x = left(to_char(y, 'fm0000'), 4) || '-' || to_char(m, 'fm0') || '#-##';       elsif = 9           m = cast(substr(pat, 6, 2) int) + 1;           if m > 12               m = 1;               y = y + 1;           end if;           x = left(to_char(y, 'fm0000'), 4) || '-' || to_char(m, 'fm00') || '-##';       elseif = 10           m = cast(substr(pat, 6, 2) int);           d = cast(substr(pat, 9, 1) int) + 1;           if (m = 2 , d = 3) or d = 4               m = m + 1;               d = 0;               if m > 12                   m = 1;                   y = y + 1;               end if;           end if;           x = left(to_char(y, 'fm0000'), 4) || '-' || to_char(m, 'fm00') || '-' || to_char(d, 'fm0') || '#';       end if;     end if;     -- original logic looks little silly we're preprocessing twice     res = mindatefrompattern(x) - interval '1 day';     if neg       return datefromfmt(to_char(res, 'yyyy-mm-dd'), neg);     else       return res;     end if; end; $$ language  plpgsql; 

this has gone through enough iterations use refactoring


Comments

Popular posts from this blog

java.util.scanner - How to read and add only numbers to array from a text file -

rewrite - Trouble with Wordpress multiple custom querystrings -