C++ string functions - Why do they often have unspecified complexity?


I noticed that, according to the recent C++ ISO standard, the string::pop_back and string::erase member functions (to name two; there are others) have unspecified complexity. What is the reasoning behind leaving the complexity up to the library implementers? Are there any known implementations where string::pop_back has non-constant complexity?

TL;DR: because of history, and because the time complexities simply haven't been proposed yet.

The situation:

[basic.string] directly specifies complexity requirements for only a few operations:

  • size() : O(1) since C++11
  • max_size() : O(1) since C++11
  • shrink_to_fit() : O(n) in C++17 (although it was added in C++11)
  • operator[] : O(1) since C++11
  • swap() : O(1)
  • data() : O(1) since C++11

It indirectly specifies many more (and I'm probably missing a few here):

  • length() : equal to size()
  • empty() : equal to size()
  • at() : equal to operator[]
  • front() : equal to operator[]
  • back() : equal to operator[]
  • copy() : O(n) (comes from the character traits requirements)
  • assign() : O(n) (comes from the character traits requirements)
  • find() and its variations : O(n) (comes from the character traits requirements)
  • append(const char* s) : O(m), where m is the length of s (comes from the character traits requirements; see the sketch just after this list)
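Concretely, the standard specifies append(const charT* s) as equivalent to append(s, traits::length(s)), and traits::length has to scan the whole argument. A minimal sketch of that reasoning (append_cstr is just an illustrative name, not library code):

    #include <cstddef>
    #include <string>

    // A rough sketch (not any particular library's code) of why
    // append(const char* s) is O(m): the length of the NUL-terminated
    // argument has to be computed first, which scans all m characters.
    std::string& append_cstr(std::string& self, const char* s) {
        // traits::length walks s until the terminating '\0'  ->  O(m)
        const std::size_t m = std::char_traits<char>::length(s);
        // appending m known characters is then bounded by the copy  ->  O(m)
        return self.append(s, m);
    }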

The requirement on data() here is key. Before C++11, the underlying storage of a string was implementation-defined. It could have been a linked list for all that matters, as long as it translated to a C-style array when needed. After C++11 the underlying implementation is still platform dependent, but the requirement that data() be O(1) effectively guarantees that the string's characters are stored in a contiguous C-style array (no more lazily copying a linked list). In C++17, data() gained an overload that returns a non-const character array you may modify, and such modifications behave the same as writing through operator[], which further solidified the implementation (FWIW the storage is still implementation-dependent, but there's no way to satisfy the time complexity requirements otherwise).
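To make that concrete, here is a small self-contained example of what the contiguity guarantee and the C++17 non-const data() overload buy you (the last part needs a C++17 compiler):

    #include <cassert>
    #include <cstring>
    #include <string>

    int main() {
        std::string s = "hello, world";

        // Since C++11, the characters are stored contiguously, so data()
        // and &s[0] refer to the same underlying array.
        assert(s.data() == &s[0]);

        // The buffer can be handed to C APIs that expect a contiguous,
        // NUL-terminated array.
        assert(std::strlen(s.c_str()) == s.size());

        // Since C++17, data() also has a non-const overload; writing
        // through it is the same as writing through operator[].
        s.data()[0] = 'H';
        assert(s[0] == 'H');
    }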

You can see that C++03 already placed a performance requirement on swap, and this reflects the philosophy of the standard for a long time: prefer to specify the behavior of an object and let implementations take care of the performance guarantees. This gives library implementers more flexibility to optimize however they see fit.

What's happened historically

When you dig into the proposals that added the complexity wordings (like N2923: Specifying the complexity of size()), you'll see these complexity requirements getting added piecemeal, as people get around to considering them.

Heck, the non-const data() overload was only added in C++17 because it hadn't been proposed before (link to the std discussion on the matter, which really links to an answer our friend Howard Hinnant posted on Stack Overflow).

Copy-on-write

In N8215, the author discusses basic_string in detail, citing copy-on-write as one of the initial philosophies behind the design (although it didn't quite get there):

On the other hand, the current specifications of basic_string are intended to allow reference-counted strings, for which copy-on-write (COW) is essential.

(Wikipedia, citing Scott Meyers, agrees.)

If the standard was going to allow copy-on-write, it made sense not to make performance guarantees in the standard, because a write to the string could be either O(1) or O(n). Saying O(n) would be correct, I suppose, but it's such a loose bound that it's misleading.
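To see why, here is a deliberately simplified, hypothetical COW string sketch (nothing like a real basic_string implementation): copies just bump a reference count, and a write through a shared buffer has to deep-copy it first.

    #include <cstddef>
    #include <memory>
    #include <string>

    // A toy copy-on-write string: copies share one buffer, and a write
    // "detaches" (deep-copies) only if the buffer is currently shared.
    class cow_string {
        std::shared_ptr<std::string> buf_;

        void detach() {
            // Shared with another cow_string? Then this write must first
            // copy all n characters -> O(n). Otherwise nothing to do -> O(1).
            if (buf_.use_count() > 1)
                buf_ = std::make_shared<std::string>(*buf_);
        }

    public:
        explicit cow_string(std::string s)
            : buf_(std::make_shared<std::string>(std::move(s))) {}

        // Copying is cheap: just bump the reference count.
        cow_string(const cow_string&) = default;

        char operator[](std::size_t i) const { return (*buf_)[i]; }  // read: O(1)

        void set(std::size_t i, char c) {  // write: O(1) or O(n)
            detach();
            (*buf_)[i] = c;
        }

        std::size_t size() const { return buf_->size(); }
    };

    int main() {
        cow_string a("abcdef");
        cow_string b = a;   // O(1): a and b share one buffer
        b.set(0, 'x');      // O(n): first write forces b to copy the buffer
        b.set(1, 'y');      // O(1): b now owns its buffer exclusively
    }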

In fact, Daniel Krügler wondered the same thing in LWG 876: basic_string access operations should give stronger guarantees, and he cited copy-on-write as the vestige:

Due to the earlier support for copy-on-write, I find the following unnecessary limitations for C++0x:
... 2. Missing complexity guarantees: data() and c_str() return a pointer to the guts, which is guaranteed to be O(1). This should be spelled out.


So it looks like it's a mixture of allowing implementers flexibility, a since-discarded idea of supporting copy-on-write strings, and a lack of people thinking about it.

Feel free to propose time complexities for the basic_string functions that are missing them. The standard may be better off for it :-)

