c - Which data structure I should use if the data is mostly sorted? -


i have huge amount of data (mainly of type long long) sorted (data spread in different files , in each file data in sorted format). need dump data file in sorted manner. data structure should use. thinking bst.

is there other ds should use can give me optimum performance ?

thanks arpit

using additional data structure won't help. since of data sorted , need fix occasional value, use simple array extract data, use insertion sort.

insertion sort runs in o(n) presorted data.

however, depends if can hold large enough array in memory or not depending upon input size.

update:

i wasn't clear on definition of "mostly sorted". means only few elements not in precise sorted position.

however, stated further, 'data in different files each file individually sorted', may candidate sub function call - merge in merge sort.

note merge routine, merges 2 sorted arrays. if have 10 files each of them individually sorted sure, using merge routine take o(n).

however, if have few off instances single file not sorted (on own), need use insertion sort.

update 2:

op says cannot use array because cannot know number of records in advance. using simple link list out of question, since never competes arrays (sequential vs random access time) in time complexity.

pointed out in comments, using link list idea if files individually sorted , need run on them merge procedure.

dynamically allocated arrays best, if can predict size @ point. since c++ tag used (only removed latter), going vector idea, since can re size comfortably.

otherwise, 1 option might heap sort, since call heapify first i.e. build heap (so can dynamically accommodate many elements needed) , still produce o(nlogn) complexity. still better trying use link list.


Comments

Popular posts from this blog

java.util.scanner - How to read and add only numbers to array from a text file -

rewrite - Trouble with Wordpress multiple custom querystrings -