Finding a member of an infinite list

by Colin Ross on January 12, 2009

Haskell, and GHC in particular never cease to amaze me.  While thinking about writing an infinite sorted list data type, I was playing around with ghci and discovered that it allows you to find a member of a list which is infinitely far from the beginning.  Then I discovered it didn’t.

An initial misunderstanding

The expression I used was

elem 2 ([1..] ++ [2..])

since this would appear to construct an infinite list by adding an infinite list of 2s to the end of an infinite list of 1s, I was fully expecting to have to interrupt the calculation as ghci happily chewed through all the 1s.

Instead, the answer True popped up immediately.  Impressive.

Then I realised my mistake.

I had created my lists incorrectly – the correct code is in fact:

elem 2 ([1,1..] ++ [2,2..])

and ‘lo and behold, the expected chewing does indeed occur.  It did get me thinking though.

From misunderstanding to thoughtfulness

In theory, a clever enough data structure would be able to return True for a suitable equivalent of this expression.

The question is just finding what that data structure should be.  I have been thinking about infinite lists recently, after my previous post on finite lists, and in particular which of the functions defined in Data.List could be guaranteed to complete when the list is infinite.

It turns out that almost all can be converted to infinite lists, but many of the functions will require additional constraints on the elements in the list in order to ensure that they return.

Consider the elem function.  With an arbitrary infinite list, there is no guarantee that a call to elem will return.  An example of a list where it wouldn’t return is

elem 0 [1..]

We can resolve this problem by requiring that all the elements of the list are greater than or equal to some value.  In this case, we know that all elements are at least 1, so we can immediately return False.

However, how do we then deal with the following:

elem 2 $ [1,3..]

Here, the infinite list consists of all the positive odd integers.  Again, it is clear that 2 is not a member, but the current implementation of elem does not terminate.  We can get around the problem by using the knowledge that the elements of the list are sorted.  In which case we can use

head (takeWhile (<2) [1,3..]) == 2

which gets us the correct answer of True.  It also allows us to discard our previous condition – since being sorted means that the minimum element will be the head of the list.  But all is not well.

Consider now, the problem

elem 2 $ [1, 1 + 1/2, 1 + 3/4, 1 + 7/8, 1 + 15/16 etc.]

(which isn’t valid Haskell, but you get the idea).  In this case, every element of the list is less than the element we are searching for, so neither elem nor takeWhile would ever terminate.  Instead, we need to impose another constraint on the list, namely that the list is unbounded – so that we will be guaranteed that takeWhile will return.

Or will we?

Consider now the list formed in the following way:

[1,1..] ++ [2..]

This monstrosity satisfies the sorted, minimum element and unbounded constraints, but still would not return.  There are two possible remedies to this.  First, and most severe, is to require that all the elements of the list be unique.  This will resolve this problem as the [1,1..] would then be invalid.  The other is to require that there is only a finite number of each element.  This second constraint is less severe than the first, and so is preferred.

So, given that the list satisfies the following:

  • is sorted
  • is unbounded
  • has a finite number of all elements

I believe we can be guaranteed to find any given element in finite time.

From thoughtfulness to enlightenment

Assuming that we are now in a position to find whether an element is a member of a suitable constrained list, are there any ways in which we can  relax the constraints?

Well, firstly there is nothing particularly special about the ascending order – we could easily change that to descending order provided we correspondingly alter the minimum element condition to instead be a maximum element condition.

Having done this, a more general approach now presents itself – we can find a given element in an infinite list provided the following conditions hold on the list l with ordering function f :: a -> a -> Ordering

  • unbounded: there exists m such that all (==GT) $ map (f m) l
  • sorted: all (/=GT) $ zipWith f l (tail l)
  • finite counts: not $ any (isInfinite.genericLength) $ group l

where I have tried to use Haskell notation as far as possible, even though actually trying to execute the code on infinite lists would probably cause non-termination.

All that is needed now is an implementation, so that is the next step.

Bookmark and Share

No related posts.

{ 7 comments… read them below or add one }

Roman Cheplyaka January 13, 2009 at 2:09 pm

Instead, we need to impose another constraint on the list, namely that the list is unbounded – so that we will be guaranteed that takeWhile will return.
This does not guarantees that takeWhile will terminate. The problem is not that list is bound, rather that you have infinite bounded subsequence (or, equivalently, finite limit point). This is exactly what you observe with [1,1...] list, but requiring distinctness of elements is not a solution.
For example, takeWhile won’t terminate for such list:
[1, 1 + 1/2, 1 + 3/4, 1 + 7/8, 1 + 15/16 etc.]++[2,3,..]
although it seems to satisfy all your constraints.

Reply

Roman Cheplyaka January 13, 2009 at 2:13 pm

Although, if we return to “real” haskell, all those [1,1..]++[2..] lists simply do not make sense.
Then your conditions will probably work but all they guarantee is that you can’t concatenate two infinite lists.

Reply

Colin Ross January 13, 2009 at 3:01 pm

@Roman

The more I think about it, the more I am beginning to agree. It has been an interesting diversion to think about this stuff though :)

Dealing with the infinite is tricky.

Reply

maltem January 13, 2009 at 7:12 pm

The list [1,1..] ++ [2..] is not unbounded, because it’s the same list as [1,1..].

Reply

Colin Ross January 13, 2009 at 7:39 pm

@maltem

I agree that in practice, the concatenation is the same as the original, but is that also the case in theory? Is sort $ [2,2..] ++ [1..] the same as sort [2,2..]?. Intuitively, to me at least, they are different.

But I am at the limits of my maths, so I am quite ready to be stood corrected.

Reply

Henry Miller January 17, 2009 at 4:13 am

Haskel is a lazy language, so a sufficiently smart elem should be able to see that the input isn’t one list, but several concatenated lists of unknown size. It can then check the first element of each list before moving to the second element of each.

Much like a Godel number can be assigned to all real numbers (I think it is real numbers – I’m several years from that class so I might forget the correct term), and so you can count them even though they are infinite in two different ways and thus a simple count would never reach 1.2.

Reply

Colin Ross January 17, 2009 at 1:20 pm

@Henry

That is what am thinking. Basically a new data structure to model a list that keeps track of how the list is being manipulated. Trying to work out which of the common list operations could be supported is where the trickiness arises.

Reply

Leave a Comment

Previous post:

Next post: