Hughes List

Hughes List

Inspired by Eric Lippert’s recent post on a data structure called a Hughes list, I thought I’d play around with it by writing my own equivalent in JavaScript.

Side note before I get started. For my usual readers who are following the progress of my JavaScript-to-native compiler (MetalScript), I’ll hopefully continue blogging about it soon. I’ve just been in the middle of an international move, along with some other projects, and haven’t had much time to work on MetalScript. My flight out of the country leaves in a few days, and then hopefully my schedule will open up and I’ll continue on it. For those who don’t know what I’m talking about but want to know more, also check out my last post where I link to a presentation on what MetalScript is, or any other posts in the category.

Getting back to the point of this post…

The key benefit I see about Hughes list structure is that it is a persistent (immutable) list that allows you to concatenate, prepend, and append to the list in O(1) time. It does so by lazily accumulating the operations to be performed until the list contents are actually needed, when it then finally performs a single O(n) pass to build the final “real” list (at least this is my interpretation of how it works).

For those who want to cut right to the chase, see this GitHub repo for the final code. It’s not a lot of code. test.mjs is the entry point, and hughes-list.mjs is the implementation of the data structure.

I suggest reading Eric’s series first for a full explanation of the underlying principles before comparing it to mine. I’m not here to explain it (which Eric has done perfectly) but rather to provide a twist and an implementation in a different programming language.

I deviated quite a lot from Eric’s implementation, in the following ways:

  • My Hughes list is built up by composing imperative procedures that each mutate a JavaScript array, rather than working with an underlying immutable SimpleList like Eric uses. This produces the same logically-immutable list structure and achieves similar performance characteristics but saves on implementing an additional SimpleList type.
  • I omitted the wrapper class. If you inspect the list in a debugger, the list will actually be the function, rather than containing the function. I wouldn’t do this in production code but it leads to a nice simple implementation here.
  • As a superficial detail, I used the traditional JavaScript function names push and unshift instead of append and push respectively.

Before jumping into the detail of the implementation, it’s probably best to take a look at how this list structure is used. These are the unit tests I have for it (see the full file for more context).

checkList(HL.empty                               , []);
checkList(HL.push(list1, 4)                      , [1, 2, 3,   4]);
checkList(HL.concat(list1, list2)                , [1, 2, 3,   4, 5, 6]);
checkList(HL.unshift(list1, 0)                   , [0,   1, 2, 3]);
checkList(HL.concat(HL.push(list1, 10), list2)   , [1, 2, 3,   10,   4, 5, 6]);
checkList(HL.concat(HL.unshift(list1, 10), list2), [10,   1, 2, 3,   4, 5, 6]);
checkList(HL.concat(list1, HL.push(list2, 10))   , [1, 2, 3,   4, 5, 6,   10]);
checkList(HL.concat(list1, HL.unshift(list2, 10)), [1, 2, 3,   10,   4, 5, 6]);

These should all look pretty obvious. If you’re not familiar with persistent data structures, the most important thing to point out here is that a call like HL.unshift(list1, 0) does not change list1, but rather returns a new list that is the like list1 but with 0 at the front. This is also good for me to highlight because I said that my implementation of Hughes list composes imperative procedures that mutate the underlying JavaScript array – so it’s worth getting your head around how this immutable list structure is implemented on top of array mutations.

My implementation of the Hughes list data structure comes out to something like the following:

export const empty = () => {};
export const single = x => xs => xs.push(x);
export const concat = (ls1, ls2) => xs => { ls1(xs); ls2(xs) };
export const push = (ls, x) => xs => { ls(xs); xs.push(x) };
export const unshift = (ls, x) => xs => { xs.push(x); ls(xs) };

The things to highlight above are:

  1. The only operation used on JavaScript arrays here is push, even in the implementation of unshift. This is the key benefit of the Hughes list. Depending on the implementation of arrays in the JavaScript engine, the native unshift on arrays may be O(n) (on an array of length n), while push is typically O(1).
  2. As previously noted, this implementation performs mutations, so rather than ls2(ls1(xs)), we do ls1(xs); ls2(xs). The former might have created a new array/list from ls1 that is passed to ls2, while the latter simply invokes both mutations one after the other on the same mutable array.
  3. You’ll notice that the implementation is backwards from Eric’s implementation, in that the definition of a list is a function that appends its items to the end of a given array, not the beginning. This again is just because appending an item to the end of an array in JavaScript is the most efficient.

I personally think this implementation still captures the essence of what makes the Hughes list structure useful, while also interplaying well with the king of JavaScript list-like structures, the array. The key effect is that it allows us to concatenate, prepend, or append items in any order with almost no cost, and let the list methods build the compound procedure such that the operations are performed in the optimal order for performance.

2 Replies to “Hughes List”

  1. I just posted an article on the original Hughes paper at https://dev.to/glebec/the-fp-papers-hughes-lists-3ada, and it occurred to me that I could turn DLists into an npm library pretty easily. I started searching to see if anyone else had already done so (surprisingly no?), and came across your post, which was very enjoyable. Using the JS array and its O(1) `push` op is a delightful and sensible reversal of the classic FP cons implementation!

    Have you considered upgrading your code to a production-ready package? Do you have any objections to my creating such a package which would explicitly include you as an author / contributor? I haven’t decided that I actually want to put this thing together, but now that I’ve seen your array-backed Hughes list, I don’t think I’d want to do it any other way in JS.

    1. Hi Gabriel. I’m glad you enjoyed my take on it :-) I have no problem at all if you want to create an npm package based on my code, and I’d be interested to see it.

      I hadn’t given a full package much thought, except that I thought a raw function might be confusing if a user is looking at it in a debugger, because there’s nothing to say that “this thing is a DList”. Also, it would make sense if it’s a production package if some basic operations are supported directly on a DList object, like `toString` and `[Symbol.iterator]`.

      Let me know how it goes :-)

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.