Haskell: sort and sortBy
Friday, 3rd July, 2009
A comment in the discussion on decorate-sort-undecorate in Haskell pointed out to me that my naive and dsu versions of sort by length had different type signatures: the dsu version needlessly required elements of the list to be of type Ord
:
sortByLength :: [[a]] -> [[a]] sortByLength xs = sortBy (comparing length) xs dsuSortByLength1 :: (Ord a) => [[a]] -> [[a]] dsuSortByLength1 xs = map snd (sort (zip (map length xs) xs))
My intention had been that the dsu version have the same requirements as the naive version. So where did this (Ord a)
context come from, and how do we get rid of it?
The relevant difference between the functions is the different type signatures for sort and sortBy:
> :type sort sort :: (Ord a) => [a] -> [a] > :type sortBy sortBy :: (a -> a -> Ordering) -> [a] -> [a]
sort
requires list elements to be comparable (obviously); sortBy
requires only a function that can generate an ordering from its input pair. In dsuSortByLength1
above, the elements of (zip (map length a) a)
are required to be of type Ord
. We could protest that just because (zip (map length a) a)
has to be Ord
doesn’t mean that a
has to be Ord
, but I suspect the compiler is playing on the safe side. [update: see first two comments.]
A function using sortBy
instead of sort
avoids this extra requirement in the type signature, and shows us a use case for comparing
:
sortByLength :: [[a]] -> [[a]] sortByLength xs = sortBy (comparing length) xs dsuSortByLength2 :: [[a]] -> [[a]] dsuSortByLength2 xs = map snd (sortBy (comparing fst) (zip (map length xs) xs))
Using sortBy (comparing fst)
rather than sort
allows us to simplfy the type signature.
We can extract length
to generalise the function:
sortByFunc :: (Ord a) => (a1 -> a) -> [a1] -> [a1] sortByFunc f xs = sortBy (comparing f) xs dsuByFunc :: (Ord a) => (a1 -> a) -> [a1] -> [a1] dsuByFunc f xs = map snd (sortBy (comparing fst) (zip (map f xs) xs))
The (Ord a)
context has returned! But here it is just a requirement on the output of the comparison function, which is fine.
Saturday, 4th July, 2009 at 6:02 am
The compiler wasn’t just being safe; there is a difference between the two. dsuSortByLength is comparing tuples of (length x,x); so when two elements have the same length, it will then sort them lexicographically. sortByLength, on the other hand, leaves equal-length elements in their original order, since sort/sortBy is stable.
That is:
> sortByLength [“bb”, “b”, “a”, “c”, “aa”,”cc”]
[“b”,”a”,”c”,”bb”,”aa”,”cc”]
> dsuSortByLength [“bb”, “b”, “a”, “c”, “aa”,”cc”]
[“a”,”b”,”c”,”aa”,”bb”, “cc”]
Saturday, 4th July, 2009 at 7:02 am
Jeff, thanks for your comment, that’s perfect. I’ve just tried dsuSortByLength2 and it works just like sortByLength.
dsuSortByLength1 is comparing (length x, x), so x has to be Ord as it’s part of the comparison; sortByLength a dn dsuSortByLength2 are comparing just length x, so there’s no requirement on x itself.
I get it now, thanks again.
Saturday, 4th July, 2009 at 10:24 am
You can also write dsuByFunc in a more point-free style:
dsuByFunc f = map fst . sortBy (comparing snd) . (zip `ap` map f)
Saturday, 4th July, 2009 at 10:31 am
Nicolas, thanks for your comment.
Very timely: ‘.’ (and ‘$’ and ‘&&&’ and maybe others) will be the subject of my next post!
ap appears to be monadic so I’ll be leaving that until I come to monads.