OTOH, that does not completely address my difficulties. Because there are fall-back algorithms to consider. And so it seems to me that the slower version of Mergesort that is both stable and in-place, still has practical value.
You mention that fast selection typically uses Quickselect, since the known linear-time selection algorithms have much worse average-case performance. That's true; however, e.g., in Musser's 1997 paper where he introduces Introsort (log-linear-time Quicksort variant), he also recommends doing selection by starting with Quickselect, then switching to a linear-time algorithm if the recursion depth exceeds some threshold. The result is a linear-time algorithm with the same average-case performance as Quickselect.
As far as I know, Musser's idea is the standard selection algorithm in use today. Or if it isn't, then it probably should be. [1]
Similarly, I have no problem with C++ std::stable_sort using linear additional space. But if it can't, then why not fall back to another log-linear-time algorithm, instead of the O(n log n log n) that is allowed for in the standard? This stuff has been known for decades. It wasn't taken into account in the 1998 standard, and it still wasn't taken into account in the 2011 standard. Why not?
I kinda feel like I'm beating a dead horse here ... but in any case, there are still unanswered questions.
[1] EDIT: Checked the C++11 standard again. The spec for std::nth_element has not been updated to reflect Musser's idea; it's still "Complexity: Linear on average." OTOH, the spec for std::sort has been updated. In 1998 it was "Approximately N log N ... comparisons on the average," while in 2011 it was simply "O(N log (N)) ... comparisons."
OTOH, that does not completely address my difficulties. Because there are fall-back algorithms to consider. And so it seems to me that the slower version of Mergesort that is both stable and in-place, still has practical value.
You mention that fast selection typically uses Quickselect, since the known linear-time selection algorithms have much worse average-case performance. That's true; however, e.g., in Musser's 1997 paper where he introduces Introsort (log-linear-time Quicksort variant), he also recommends doing selection by starting with Quickselect, then switching to a linear-time algorithm if the recursion depth exceeds some threshold. The result is a linear-time algorithm with the same average-case performance as Quickselect.
As far as I know, Musser's idea is the standard selection algorithm in use today. Or if it isn't, then it probably should be. [1]
Similarly, I have no problem with C++ std::stable_sort using linear additional space. But if it can't, then why not fall back to another log-linear-time algorithm, instead of the O(n log n log n) that is allowed for in the standard? This stuff has been known for decades. It wasn't taken into account in the 1998 standard, and it still wasn't taken into account in the 2011 standard. Why not?
I kinda feel like I'm beating a dead horse here ... but in any case, there are still unanswered questions.
[1] EDIT: Checked the C++11 standard again. The spec for std::nth_element has not been updated to reflect Musser's idea; it's still "Complexity: Linear on average." OTOH, the spec for std::sort has been updated. In 1998 it was "Approximately N log N ... comparisons on the average," while in 2011 it was simply "O(N log (N)) ... comparisons."