Sean Foy

Wednesday, February 8, 2012

Book Review: The Name of the Rose

This is a review of The Name of the Rose by Umberto Eco.

This novel is set in an Italian monastery in 1327. The protagonist Adso of Melk is a novice accompanying his master William of Baskerville to investigate a murder. It's fun as a serial murder mystery, and it's interesting to think about the historical and ethical issues that are raised concerning knowledge, secrecy, reasoning, belief, etc.

William of Baskerville is himself a disciple of William of Occam, and at times his reasoning seems almost Bayesian: he says, "reasoning about causes and effects is a very difficult thing," and then explains about the usefulness of conditional independence in simplifying inference.

At one point William tells an amusing parable reminiscent of a recent BBC article: "[i]f you wish to keep a place clean here, to prevent anyone from pissing on it, which t he Italians do as freely as dogs do, you paint on it an image of Saint Anthony with a wooden tip, and this will drive away those about to piss."

I gather from the introduction that the author, a semiotician, embedded a rich collection of references to more or less subtle points of Catholic theology that I missed but that others may enjoy.

Some descriptive passages are a bit florid but on the whole I enjoyed this book.

Saturday, December 31, 2011

Book Review: The House of Wittgenstein

This is a review of The House of Wittgenstein: A Family at War by Alexander Waugh.

This book traces the lives of four generations from the late nineteenth century to the early twenty-first. During this time: a great fortune was made in trade, real estate, and industry; a pianist lost his right arm but continued to perform at a world-class level of proficiency; a philosopher rose to international prominence; the family evaded Nazi persecution.

Some of the most interesting material for me concerns war time. For example, Paul was taken as a POW by the Russians during The Great War and was held at the krepostnaia katorga at Omsk. He was allowed to send letters home, but his letters do not describe the conditions for which the prison camp was infamous. Waugh notes that the Russians censored outbound letters and anyway many prisoners wanted to avoid worrying their families or felt shame for having been captured. Interestingly, the Austro-Hungarian Kriegsüberwachungsamt was censoring inbound mail:

Letters have been received lately from our prisoners of war in enemy countries. In some of these letters the writers describe life in captivity in a very favorable light. The spreading of such news among the troops and recruits is undesirable. The military censors are therefore to be instructed that such letters of our prisoners of war as may, by their contents, exercise an injurious influence are to be confiscated and not to be delivered to their addressees. [p. 75]

The Austro-Hungarian government funded the war by printing money and consequently there was severe inflation during the twenties; "by August 1922 paper money was virtually worthless as consumer prices rose to a level 14,000 times higher than they had been before the war" [p. 128]. Trade embargoes further hampered commerce with the result that 96% of Austrian children were malnourished [ibid]. For the Wittgensteins, this led to conflicts between siblings about philanthropic and investment decisions.

It is pretty interesting to see how the Wittgenstein's understanding of the Nazis developed during WWII. Initially they supported the Heimwehr austrofascists against the Nazis in opposition of Anschluss, socialism, "exaggerated racial theories," and "a semi-pagan German national religion". When the Germans invaded, the Wittgensteins were frustrated in their attempt to sell some art abroad and annoyed by the new administration's insistence that they fly the Nazi flag over their palais, but they did not fear for their personal safety until later. They seem to have underestimated the sincerity of Nazi leaders with respect to racial rhetoric. In fact, their immense wealth and political connections outside the Reich did save them from the concentration camps even as strategic and tactical disagreements irreparably inflamed tensions among siblings.

This extraordinary group of people during an extraordinary period of time makes for an engaging narrative. It is competently written and my impression is that it is well-researched, but I certainly read it with variable velocity and interest. I recommend this book to those who are interested in the economic and political history of early twentieth century Europe, and curious about seeing those events through the perspective of one prominent family.

Wednesday, November 30, 2011

Book Review: The Annotated Turing

This is a review of The Annotated Turing by Charles Petzold.

The heart of the book is Turing's On Computable Numbers, with an Application to the Entscheidungsproblem, segmented and the segments interleaved with commentary and corrections from Petzold and Turing's own erratum. It is prefaced with some background material that will be a good refresher or even introduction for non-mathematicians. The end of the book draws connections between Turing and his/our contemporaries who carried forward his work.

Before this, my exposure to the Church-Turing Thesis had always been indirect, via the later works of Kleene and others. It was interesting to see some of the differences in the original presentation of these ideas. For example, I didn't realize or remember that the Halting Problem was introduced by Martin Davis; Turing himself discussed mainly "circle-free" machines that would never halt. I also liked that Petzold gives a rounder portrait of Turing by the inclusion of material on Turing's personal history.

I believe this book would be accessible to readers with no prior cs background. Check it out if you are interested in either computer science or the philosophy of mathematics.

Monday, October 31, 2011

Book Review: The Power Broker: Robert Moses and the Fall of New York

Robert Moses was smart and got things done. He showed extraordinary idealism in his early adulthood, and worked hard to apply his wealth, privilege, and considerable talents in hopes of making the world a better place. Over his lifetime, he mastered realpolitik and wielded considerable power; from his relatively modest nominal appointments, he dominated not only mayors of New York City but also the statehouse and in one conflict a sitting President of the United States. The legacy of his most effective decades using billions of dollars to remake the neighborhoods of New York City and establish the pattern of development for Long Island will persist for centuries, to the chagrin of many.

Fittingly, this is a monumental biography with 1162 pages of main matter and followed by 84 pages of notes and references. It is occasionally indulgent, but on the whole I think it was well-edited. Moses was deeply flawed and immensely capable and driven; it is interesting to learn about how he amassed, used, and lost influence over his lifetime.

Sunday, January 2, 2011

Book Review: The Time Traveler's Guide to Medieval England

This is a book review of The Time Traveler's Guide to Medieval England by Ian Mortimer.

History is more fun without the politics

Context is a vital consideration for historical interpretation, but ask a history student to summarize what he's learned and you'll likely hear exclusively about famous people and events. Some of these people were colorful characters, and many of the events make for good stories. But this book focuses on the complement of that history, highlighting the crucial but normally subordinated milieu of an age.

Mortimer's guidebook tells of the customs, etiquette, food, clothing, laws, and economic conditions in which the 14th century English lived their lives. The book is written in a light-hearted style; it's a very entertaining way to cure some of your misconceptions and enrich your understanding of the period. Recommended.

Saturday, January 1, 2011

Book Review: The Ghost Map

This is a review of The Ghost Map by Steven Johnson.

Johnson's account of John Snow, Henry Whitehead, and other key figures in the story of a nineteenth century London cholera epidemic is an engaging narrative with insights about scientific investigation and persuasive argument. Starting with an orientation to London c. 1854, the heart of the book explains how a particularly perspicacious investigator developed and tested hypotheses to reach correct and influential conclusions. Along the way we see how somewhat different approaches led others to different outcomes. Of course, the process of scientific discovery remains relevant to us in the 21st century, as are many of the issues concerning urban design and public policy.

The Broad Street cholera epidemic of 1854 was in large part a consequence of urbanization with inadequate public works infrastructure, at a very early stage in the development of medical science and microbiology. Particularly in the last chapter, "Broad Street Revisited," Johnson discusses the relation between 1854 London and present-day first-world city planning. Citing Jane Jacobs and other contemporary sources, he discusses the architecture of cities, the role of government organizations such as the CDC, and even the War on Terror and nuclear non-proliferation. I think he overreached in this attempt to reinforce the continuing relevance of the Snow/Whitehead collaboration and to advocate for positions on tenuously connected modern issues.

Readers of Tufte's Visual Explanations will recognize this epidemic from several pages in that book (it was also mentioned in The Visual Display of Quantitative Information). While Tufte focuses on an analysis of Snow's compelling maps as designed artifacts, Johnson describes in detail the scientific inquiry that led to the development of those maps. Johnson also offers a short critique of Tufte's treatment. It is disappointing that after describing Snow's second-edition Voronoi-enhanced map (see also) as "Snow's most significant contribution to the field of disease mapping" and calling out Tufte on having overlooked it, Johnson omits the diagram from his own book.

The Ghost Map includes end notes, but the note numbers appear only in the notes and not in the text. The absence of a visual indication that more detail is available makes the end notes less likely to be used and made me less trusting about some of the author's assertions.

I have pointed out some weaknesses in The Ghost Map, but on the whole I think it was well worth reading. If you are interested in Snow's maps, the history of epidemiology, or the process of scientific inquiry, I heartily recommend this book.

Thursday, April 22, 2010

Palindromic Pangrams

This post details a solution to a disused hiring problem. If you are a graph search wiz, you will find yourself skimming; I assume readers either haven't done this before or have forgotten it mostly. Even experts might be interested in the pathology, so hopefully there's something here for everyone to enjoy.

I've been unusually quiet lately because I'm in the midst of a truly marathon interview cycle. The things I'm working on probably aren't interesting to other people, and they're taking up all of my time. But before this began I spent some time working on a hiring puzzle that I think is blog-worthy.

I couldn't share the details of my approach at the time that I was actively working on the problem, because that would hurt the company and other candidates by raising the question of plagiarism. In fact, ITA removed the problem I was working on from their web site, and when I wrote their recruiter to ask why, I found out it was because somebody else wrote about it! Now it seems the rumor that Google is acquiring ITA is confirmed and some people are talking about other hiring problems. So I'm doubly sure it's OK to post this now.

First, the problem specification:

A palindrome is a sequence of words like "lid off a daffodil" or "shallot ayatollahs" that uses the same letters reading backwards as forwards. The words need not form a meaningful or grammatical sentence. A palindromic pangram is a multi-word palindrome that includes all 26 letters of the alphabet. Find the shortest sequence of words that is both a pangram and a palindrome.

That sounded like something that could occupy me for a few hours. Of course ITA would have chosen the dictionary so that brute-force approach wouldn't be fast enough, but with a reasonable choice of data structures and algorithms, it shouldn't be too hard. Writing up my solution would require lots of care; I imagined it would be carefully code reviewed to see if I was worth talking to, in which case I'd spend most of the interview defending my solution and talking about how it could be improved. Maybe several days to edit for clarity and efficiency and so on. I wound up spending weeks trying with some success to make it run faster. Besides learning some python, some humility and having an excuse to actually do things I'd previously only read about, I learned this lesson: the point of solving a puzzle is to GET A PHONE SCREEN. That's all. It was fun, but I won't repeat the mistake of undertaking substantial work without feedback.

If you want to play along at home, pause reading here and resume when you're happy with your own solution to compare notes.

Enough introduction. Let me show you some code along with commentary based on my notes. I started with a little brainstorming. I'll give you the main points first and then give a more thorough discussion later.

A quick note on optimization terminology: an optimization problem is normally posed with a set of constraints and an objective function. For example, the constraints for this problem specify that we are only interested in pangrams, and we are only interested in palindromes. We call solutions that satisfy the constraints feasible. The objective function grades the goodness of feasible solutions. For example, in this case our objective function is phrase length. An optimal solution is a feasible solution for which no other feasible solution is scored better by the objective function.

First observation: For each letter, some word in our pangram must contain that letter.

First idea: beam stack search word-sets.

Elaboration: Here I'm considering pangram formation as a search problem in the space of words organized by letter inclusion. Each dictionary word is a child of the empty root node. Each node has the complement of dictionary words represented by its (improper) ancestors as children. Assuming a small constant bound c on word length, classifying words will cost |words| * 26 * c = O(n). Among sibling nodes, I can prioritize words containing the rarest letter in order to avoid unfruitful branches. Even so, the six-figure branching factor gives a very big search space.

Second observation: For each prefix p of the letter-sequence of the pangram, the reverse of p must exist in some concatenation of words.

Second idea: use a trie for reversed words to help prune the search for palindromes. The overall search tree has min(n, 26^c)! nodes, so let's only build tries for feasible word-sets.

We need a formal definition of our goal:


def palindromeP(seq):
   """ Characteristic fn for palindromes.

   >>> palindromeP(['a'])
   True

   >>> palindromeP(['a', 'b', 'a'])
   True

   >>> palindromeP(['ab', 'c', 'ba'])
   True

   >>> palindromeP(['ab', 'c', 'ab'])
   False
   """
   return (lambda p: p[:int(len(p)/2)] == (p[int(math.ceil(len(p)/2.0)):])[::-1])(''.join(seq))

For those who don't use python, the triple-quotes delimit a docstring comment, and what looks like a toplevel transcript is a doctest specification showing inputs and expected outputs. The code just says that we are concerned with the concatenation of the given sequence: when the first half matches the reverse of the second half, we have a palindrome.

The definition of pangram depends on the letters that comprise our candidate solution:


class Histogram:
   def __init__(self, seq):
       self.h = collections.defaultdict(set)
       for s in seq:
           for i in s:
               k = i.lower()
               self.h[k].add(s)
   def __len__(self):
       return len(self.h)
   def __getitem__(self, key):
       return self.h[key]
   def __iter__(self):
       if (not hasattr(self, 'ordered_h')):
           self.ordered_h = sorted(self.h.items(), key = lambda xy: -len(xy[1]))
       return self.ordered_h.__iter__()
   def top(self):
       for x in self.__iter__():
           return x[1]
   def __repr__(self):
       return '(Histogram %s)' % self.h.__repr__()

The histogram is a glorified dictionary mapping letters to words. It participates in the python idiomatic iteration scheme by partially ordering its values by overall letter frequency.


def pangramP(seq):
   """
   Characteristic fn for pangrams.
   (assumes the alphabet is limited to A-Za-z)

   >>> pangramP([''])
   False

   >>> pangramP(['a'])
   False

   >>> pangramP([string.ascii_lowercase])
   True

   >>> pangramP(dict((x, x) for x in string.ascii_lowercase))
   True
   """
   if isinstance(seq, collections.KeysView):
       h = seq
   elif isinstance(seq, str):
       h = set(seq)
   elif isinstance(seq, collections.Mapping):
       h = seq.keys()
   elif isinstance(seq, Histogram):
       h = seq
   else:
       h = Histogram(seq)
   return len(h) == 26

Now, let's look at the search implementation. I've characterized the problem in terms of tree search, so I need a representation for nodes:


class WordSetNode:
   def __init__(self, word, parent = None):
       if not hasattr(word, '__iter__'):
           word = str(word)
       self.word = word
       self.parent = parent
       self.covered = dict.fromkeys(word)
       if parent:
           self.covered.update(self.parent.covered)
   def coveredP(self, letter):
       return letter in self.covered
   def pangramP(self):
       return pangramP(self.covered.keys())
   def ancestors(self):
       n = self
       while (n):
           yield n
           n = n.parent
   def phrase(self):
       return reversed([n.word for n in self.ancestors() if n.word])
   def phrase_string_ns(self):
       if not hasattr(self, 'phrase_string_ns_cache'):
           self.phrase_string_ns_cache = ''.join(self.phrase())
       return self.phrase_string_ns_cache
   def __repr__(self):
       if not self.parent:
           return "WordSetNode('%s')" % self.word
       return "WordSetNode('%s', %s)" % (self.word, repr(self.parent))

The first thing you might notice is that I have backlinks from children to parents, and a very odd thing you'll notice is that I have no edges from parents to children! I'm not planning to build a tree and then search it for solutions. I want to build my tree on-demand in order to avoid needless computation and limit my memory footprint. Basically, I move from parents to children by generating nodes rather than by following a path through some pre-existing structure. The backlinks are there for convenience.

As in the sketch above, the tree search is a quest for pangrams having the palindrome property, so nodes are viewed as contributing letter-coverage and paths through the tree represent words in the overall pangram. I cache information about the path to each node along the way.

So, about those pangrams: my idea for pruning infeasible word-sequences involved prefix trees ("tries"). This is the first time I've used them in real life, so I will remind you that prefix trees are all about paths; the element data aren't associated with particular nodes but rather with paths from the root. It's very efficient for queries of the form: which elements have x as a prefix? In this case, my trie nodes will represent letters; the trie itself represents a collection of letter-sequences:


class Trie:
   def __init__(self, seq = []):
       if type(seq) == type(''):
           seq = [seq]
       self.terminus = False
       self.children = {}
       for i in seq:
           self.add(i)
   def emptyP(self):
       return not self.children
   def match(self, word):
       """Find the deepest subtrie representing
       a prefix of word, and the remaining suffix.

       >>> Trie().match('')
       (Trie(), '')

       >>> Trie().match('abc')
       (Trie(), 'abc')

       >>> Trie('abc').match('')
       (Trie(['abc']), '')

       >>> Trie('abc').match('a')
       (Trie(['bc']), '')

       >>> Trie('abc').match('aac')
       (Trie(['bc']), 'ac')

       >>> Trie(['abc', 'aac']).match('achtung')
       (Trie(['ac', 'bc']), 'chtung')

       >>> Trie(['abc', 'aac']).match('aardvark')
       (Trie(['c']), 'rdvark')
       """

       if not word:
           return (self, word)
       if not word[0] in self.children:
           return (self, word)
       return self.children[word[0]].match(word[1:])
   def add(self, word):
       if not word:
           self.terminus = True
           return
       subtree, remainder = self.match(word)
       if not remainder:
           # mark the word if it's not already marked
           return subtree.add('')
       subtree.children[remainder[0]] = Trie([remainder[1:]])
   def search(self, w):
       quarry, remnant = self.match(w)
       if remnant:
           return None
       return quarry
   def traverse(self, preorderVisit, postorderVisit, prefix = ''):
       preorderVisit(self, prefix)
       for c in self.children.items():
           c[1].traverse(preorderVisit, postorderVisit, prefix + c[0])
       postorderVisit(self, prefix)
   def pre_order_nodes(self, ancestry = None):
       if ancestry is None:
           ancestry = []
       yield (self)
       for c in self.children.items():
           for i in c[1].pre_order_nodes():
               yield i
   def words(self, prefix = ''):
       if self.terminus:
           yield prefix
       for c in self.children.items():
           for l in c[1].words(prefix + c[0]):
               yield l
   def __repr__(self):
       W = ', '.join(repr(w) for w in sorted(self.words()))
       if W:
           return "Trie([%s])" % W
       else:
           return 'Trie()'

a trie representing {a, daffodil, lid, lug, off, offset}

As I explore my overall search tree, building up a sequence of words, I want to ensure that that my sequence has palindromic symmetry. I can do this by querying a trie-filled-with-reverse-words for my current search tree path.

The basic idea seems OK, but I haven't addressed word segmentation within the pangram. We disregard whitespace in the problem definition, so we can't limit our search to pangrams with word-length symmetry. We've got to account for "massive … sam" wherein we could reach a terminal node of the prefix trie of reversed words before exhausting letters in the current search path.


def loopyTrieSearch(t):
   """Produce a function that searches
   a trie augmented with backedges from
   each leaf to the root.
   >>> t = Trie()
   >>> s = loopyTrieSearch(t)
   >>> s(t, 'unmatchable')
   >>> t = Trie(['abc', 'aac', 'b'])
   >>> s = loopyTrieSearch(t)
   >>> s(t, '')
   [Trie(['aac', 'abc', 'b'])]

   >>> s(t, 'a')
   [Trie(['ac', 'bc'])]

   >>> s(t, 'ba')
   [Trie(['ac', 'bc'])]

   >>> s(functools.reduce(Trie.merge, s(t, 'b')), 'ab')
   [Trie(['c'])]

   >>> s(t, 'az')
   >>> t = Trie(['ab', 'abc', 'd'])
   >>> s = loopyTrieSearch(t)
   >>> s(t, 'abd')
   [Trie([''])]

   >>> s(t, 'azzz')
   """
   def s(t2, w):
       ctx = t2 if isinstance(t2, collections.MutableSequence) else [t2]
       if w == '':
           return ctx
       for l in w:
           for i in ctx:
               if i.terminus:
                   ctx.append(t)
                   break
           ctx = [j[0] for j in (i.match(l) for i in ctx) if j[1] == '']
       if not ctx:
           return None
       return ctx
   return s

From these ideas, I built a program that ran very nicely on a contrived, tiny, dictionary, but when I ran it on the 200kword ITA dictionary, my cpu fan revved to 100% and stayed there indefinitely. I should show some profiling data here.

How could I build my search tree more cheaply? Initially, to generate a node's children, I would take my letter-coverage dictionary's complement to see what letters I needed. I'd find all the words that could contribute a needed letter. For each word, I'd use my loopyTrieSearch. This tactic does avoid exploring infeasible branches, but it involves a lot of unsuccessful trie queries. What if I started by loopyTrieSearching the prospective parent's path, and then considering the loopy suffixes as possible words containing a needed letter?

Consider a trie pair (A, B) with A containing "awk" and "awkward" and B containing only "awesome." The naive approach requires us to conduct separate trie searches for "awk" and "awkward," both times stopping when we fail to match the ks. The improved approach allows us to simultaneously eliminate both words as soon as we find that "aw" has no child "k." If the average branching factor in the trie over all words (26 at most) is less than the average number of words associated with each letter (200000/26), then the latter could be a superior tactic. This is the purpose of the Trie.filter method:

class Trie: # ... def filter(self, criteria, root = None): """Return the words in this trie that match the augmentation of criteria by the addition of backedges from terminals to the root. >>> sorted(w for w in d if s(T, w)) == [w for w, i in itertools.groupby(sorted(FT.filter(T)))] True """ def f(words, crit, prefix = ''): if words.terminus: yield prefix if words.emptyP(): return backedge = root.children.items() if crit.terminus else () for c in itertools.chain(crit.children.items(), backedge): m = words.match(c[0]) if m[1] != '': continue for w in f(m[0], c[1], prefix + c[0]): yield w if not root: root = criteria for i in f(self, criteria): yield i @staticmethod def merge(a, b): """Merge two tries. >>> Trie.merge(Trie(['a']), Trie(['b'])) Trie(['a', 'b']) >>> Trie.merge(Trie(['abc']), Trie(['bcd'])) Trie(['abc', 'bcd']) >>> Trie.merge(Trie(['abc', 'aac']), Trie(['abe'])) Trie(['aac', 'abc', 'abe']) """ result = Trie() result.children = a.children.copy() for k, v in b.children.items(): if k in result.children: result.children[k] = Trie.merge(v, result.children[k]) else: result.children[k] = v result.terminus = a.terminus or b.terminus return result

For this, I'll use a pair of tries, one reverse and one forward. My search path corresponds to paths in each of those tries; as I query in the trie pair I need never pursue a branch that's available in only one of the tries.


class TriePair:
   def __init__(self, W, Wr = None):
       if not Wr: Wr = W
       self.fwd = W if isinstance(W, Trie) else Trie(W)
       self.rev = Wr if isinstance(Wr, Trie) else Trie(w[::-1] for w in Wr)
       self.s = loopyTrieSearch(self.rev)
   def expand(self, node, advisoryBudget = None, peek = True):
       self.setupContinuation(node)
       if not node.trieContinuation: return
       if peek and node.pangramP() and not node.parent.pangramP():
           # don't peek in the recursive expansion
           for i in self.completePalindrome(node, expand = functools.partial(self.expand, peek = False)):
               yield i
       W = itertools.chain(*(self.fwd.filter(c, self.rev) for c in node.trieContinuation))
       awords = set(a.word for a in node.ancestors())
       for w in set(W):
           if w in awords: continue
           n = WordSetNode(w, node)
           # if pangram and \exists a word matching some trie
           #  path to a leaf -> generate the easy palindrome
           n.setupContinuation = self.setupContinuation
           yield n

I mentioned earlier that the search tree is constructed during the search process. The expand function does that work, producing the child nodes of each parent. I'll explain peeking later. It would be possible to query our search tries for the string corresponding to our path whenever we expand a node, but then each descendent must redo its ancestors' trie search work. So, we copy the parent's search context into a cache (trieContinuation) on each child in order to avoid that extra work:


class TriePair:
# ...
   def setupContinuation(self, n):
       if hasattr(n, 'trieContinuation'):
           return
       elif self == n and hasattr(n, 'setupContinuation'):
           n.setupContinuation(n)
       elif n.parent and hasattr(n.parent, 'trieContinuation'):
           n.trieContinuation = (
               self.s(n.parent.trieContinuation,
                      n.word))
       else:
           n.trieContinuation = (
               self.s(self.rev,
                      n.phrase_string_ns()))

Let's take a closer look at beam-stack search.

Recall that depth-first search always expands the deepest unexplored node, while breadth-first search always expands the shallowest unexplored node. Depth-first search uses memory proportional to the longest path to a leaf; breadth-first search uses memory proportional to the size of the largest layer, O(d^b) where d is the longest path to a leaf and b is the branching factor at each node. For many problems, the better of two feasible solutions is the one closer to the root, and for these problems breadth-first search finds good solutions faster than depth-first search (which can pay an infinite penalty for unlucky early expansions).

Best-first search algorithms take an intermediate course between depth-first and breadth-first search. They choose which node to expand next by applying a heuristic. A* is a well-known such algorithm, and it's the basis for beam-stack search. The exploration order (denoted f(n) = g(n) + h(n)) takes into account both the (known) cost of each node's path from the root (g(n)) and the (estimated) cost of its path to a solution (h(n)). The estimation must be conservative in the sense that it never overestimates a potential solution's cost; this allows admissible pruning: once we've found a solution, we can safely avoid expanding any subtrees with higher cost estimates.

Best-first search can degenerate into breadth-first search in the worst case for memory use; beam-stack search addresses real-world memory constraints by enabling inadmissible pruning without compromising completeness (we will find a solution if one exists) or optimality (no solution is better than the one we find). Beam-stack search does this by keeping metadata proportional to the depth of the search that allows it to backtrack, regenerating and thoroughly exploring inadmissibly pruned nodes after it has explored the more promising portion of the graph.

Note that beam-stack search is an anytime algorithm: it strives to return a feasible solution ASAP and then to return improved solutions as they discovered. It is also complete and optimal, so it will definitely return a solution if any exists and the final solution it returns is guaranteed to be at least as good as any that exist.

The following graphs illustrate the progress of a beam-stack search on a small graph with a simple cost function and heuristic.

Bounds on the f-cost of child nodes are associated with each layer. Initially, we consider all possible children. Whenever we generate a child, we check for solution feasibility.

For each node, we can compute the (certain) g-cost of the partial solution and the (guessed) h-cost of the potential solution descendent of the node. In this example, we assume that g is phrase length (not including spaces) and h is the number of letters we need to add to produce a pangram. These are sensible, if naive, assumptions for the pangramic palindrome problem. But, to keep this example small, we will cheat by using an easier feasibility predicate: we will accept any palindrome; pangrams are not required.

When memory is limited, we inadmissibly discard some children, updating the parent layer upper bound accordingly. If backtracking proves necessary, we can use the upper bound to regenerate only the inadmissibly discarded children, thus avoiding duplicated work. Assuming a beam width of 2 nodes, we partition the set of children by f-cost. In this case, a and lid survive while off and daffodil are pruned.

We are ready to expand the frontier.

The layer exceeds our memory threshold, so it's time to prune again.

We retain the most promising nodes and update the layer bound.

We are ready to expand the frontier.

Our frontier exceeds our memory limit, but I think I've made the point about pruning so let's double the limit to 4. OK then we can expand again.

If the dictionary contains only the words {lid, off, a, daffodil} then we are unable to expand beyond this point without duplicating words. Notice that we have yet to find any palindromes. Because we cannot go forward, we must go backtrack.

We backtrack to the deepest layer at which inadmissible pruning occurred (i.e., where the upper bound is finite). There, we update the bounds to show that we have fully explored the space below the upper bound; now we will consider the space above. We are ready to expand the frontier.

Inadmissible pruning works just the same after backtracking as before. To see this, let's tighten the memory limit back to two.

As ever, we remove the nodes with high f-cost.

We are ready to expand the frontier.

This layer exceeds our memory limit, so we need to prune.

We discard unpromising nodes.

We are ready to expand the frontier.

We find a feasible solution on the frontier: lidoffadaffodil. This is a proof by example of the maximum cost of the optimal solution to our problem. Our heuristic is admissible, meaning that it never underestimates the cost of the best solution that uses a given node. Therefore, from now on, we can admissibly prune nodes whose f-cost exceed the f-cost of our candidate solution. The memory limit might still cause us to do some inadmissible prunings in the future, and in that case we might need to backtrack. But, admissible prunings never create an obligation to backtrack. If we continue from here, we will eventually exhaust the space between 27 (where we inadmissibly pruned some nodes) and 35 (where we have a solution) and determine that in fact we have an optimal solution.

So here are the parts to our cost function f:


def g(n):
   # sort of like
   #   lambda n: sum(len(w.word) for w in n.ancestors())
   # but with caching
   if not hasattr(n, 'g'):
       result = len(n.word)
       if n.parent:
           result += g(n.parent)
       n.g = result
   return n.g
def h(n):
   """Admissible heuristic estimate of cost
   remaining from this node.

   >>> h(WordSetNode('a')) == h(WordSetNode('aa'))
   True

   Though admissible, it's reasonable tight.
   >>> minimalCost = 26 * 2 - 1
   >>> inadmissibleMargin = 1
   >>> n = WordSetNode('a')
   >>> h(n) > minimalCost - inadmissibleMargin - len(''.join(n.phrase()))
   True
   """
   phrase_width = g
   tie_breaker = .01 * len(n.word) + .01 * abs(hash(n.phrase_string_ns()))/float(sys.maxsize)
   covered = len(n.covered)
   completely = len(string.ascii_lowercase)
   # The symmetry of the palindrome guarantees
   # that the first half is a pangram. So,
   # there's no use evaluating palindrome-dness
   # until we have a pangram.
   if covered < completely:
       # descendent nodes must:
       #   cover the missing letters on the left
       #     (just to the right of n)
       #   match the missing letters on the right
       #     (with a possible exception at the pivot)
       #   match what we have so far
       # TODO: improve this estimate by:
       #   (a) looking ahead in the trie for the missing
       #       letters
       #   (b) using the max_missing(min(len(word) | missing \in word))
       #       rather than 1 for the cost component due to
       #       missing letters
       missing = completely - covered
       # strategy b:
       # this doesn't help: when letterCosts
       # wins the max, we lose our ability to
       # distinguish between nodes that help
       # us gain coverage and nodes that do not.
       #needed = max(missing + (missing - 1), max(letterCosts[i] for i in string.ascii_lowercase if not i in n.covered))
       needed = missing + (missing - 1)
       return phrase_width(n) + needed - tie_breaker
   else:
       return palindrome_balance(n) - tie_breaker

And here's how those functions are used, along with parameters for tuning memory use (beam_width), the upper bound on cost for solutions we're willing to accept (U), a predicate to control which intermediate solutions we yield, and a way to inject a stack (which is handy for resuming cancelled searches):


def bss(nodes, expand, g, h, beam_width, U = sys.maxsize, feasibleP = lambda n: False, stack = None):
   """Beam-stack search, a variation on
   breadth-first branch and bound with
   backtracking.

   >>> [i for i in bss([], None, lambda n: 1, lambda n: 1, 10)]
   []

   >>> [i for i in bss([1], lambda n: [], lambda n: 1, lambda n: 1, 10)]
   [1]

   >>> sorted([i for i in bss(range(15), lambda n: [], lambda n: 1, lambda n: 1, 10)])
   [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]

   >>> sorted([i for i in bss(range(5), intexpand, lambda n: 1, lambda n: 0, 3)])
   [1, 2]

   """
   def dedup(frontier):
       # This is a warning rather than an error:
       # some nodes really are equivalent though
       # not identical. For example, spaces don't
       # matter in pangramic palindromes.
       frontier = sorted(frontier, key = lambda p: p[0])
       groupedFrontier = (
           itertools.groupby(
               sorted(frontier,
                      key = lambda p: p[0]),
               lambda p: p[0]))
       frontier = []
       for i in groupedFrontier:
           elts = list(itertools.islice(i[1], 2))
           if (len(elts) <= 1):
               frontier.extend(elts)
           else:
               frontier.append(elts[0])
               print('f needs nuance: these nodes have the same f-cost: %d = f(%s) = f(%s)' % (elts[0][0], ' '.join(elts[0][1].phrase()), ' '.join(elts[1][1].phrase())))
       return sorted(frontier, key = lambda p: p[0])
   def prune(frontier, beam_width):
       if len(frontier) <= beam_width:
           f_max = frontier[-1][0]
           print('pruned nodes, new f_max = %s' % f_max)
           return frontier, f_max
       # We'll never backtrack to the nodes
       # we're pruning, even if these nodes
       # are never admissibly discarded.
       # beam_width may be smaller than the largest
       # equivalence class defined by f.
       # Ideally, these classes should contain
       # one node each; otherwise it's difficult
       # to choose a beam width that will avoid
       # splitting an equivalence class.
       s = dedup(frontier)
       retained, pruned = s[:beam_width], s[beam_width:]
       return prune(retained, beam_width)

   # 63: verbify ma pluck now sh ad go jo zax eta qat ex azo jog dahs wonk culpa my fib rev
   # 65: buckrams donzel ti pyx of vug we jo hm suq us mho jew guv foxy pit lez nods mark cub
   # 65: verbify na ducks to hm we jo zax al pig suq us gip lax azo jew mho tsk cud any fib rev
   # 67: bonked vulgars ti hm oy cap wo jo zax if suq us fix azo jow pac yom hits rag luv de knob
   # 69: verbify locks twang up mho zax ed ma jab suq us ba jam dex azo hm pug naw tsk coly fib rev
   # 71: verbify lock nut spaz we jig am dex eh ta qua sau qat hexed magi jew zap stunk coly fib rev
   # 76: drawknife butyls go hm op ox azo jo we vac suq us suq us cave wo jo zax op om hog sly tube fink ward
   # 86: de but way xi sh on vug la ef pa me re ka jo ah coz om ta qat ta qat mozo chao jake remap feal guv noh six yaw tubed
   # 92: drawknife butyls go hm op ox azo jo we vac suq usherettes setter eh suq us cave wo jo zax opo mho g sly tube fink ward
   # 101: underflow machs ti by xi pe ka jo zag eve suq aimu umiaq use vega zo jake pixy bits h cam wolfr ed nu

   f = lambda n: g(n) + h(n)
   if not stack:
       oldFrontier = [(f(i), i) for i in nodes]
       for i in filter(lambda x: feasibleP(x[1]), oldFrontier):
           U = min(U, g(i[1]))
           yield i[1]
       oldFrontier = [i for i in oldFrontier if i[0] < U]
       stack = [[0, U, oldFrontier]]
   j = 0
   while stack:
       f_min, f_max, oldFrontier = stack[-1]
       print('f_min, f_max, level = %s, %s, %d' % (f_min, f_max, len(stack)))
       if oldFrontier: print('sample phrase: %s' % ' '.join(oldFrontier[0][1].phrase()))
       frontier = []
       for i, o in enumerate(oldFrontier):
           visit(o[1])
           j += 1
           if not j % 32: print('nodes expanded: %d' % j)
           k = 0
           for n in expand(o[1], advisoryBudget = f_max):
               k += 1
#                if not k % 8196: print('progress on frontier: %d' % k)
               if feasibleP(n):
                   gn = g(n)
                   if gn >= U: continue
                   # g-values are integers, and f
                   # is admissible, so all the f-values
                   # above g - 1 have f-cost at least
                   # as high as g (i.e., they're no
                   # better than this solution).
                   U = gn - 1 + (1.0 / sys.maxsize)
                   f_max = U
                   yield n
                   oldFrontier[i + 1:] = filter(lambda p: p[0] < U, oldFrontier[i + 1:])
                   continue
               p = (f(n), n)
               if f_min < p[0] < f_max:
                   frontier.append(p)
                   if len(frontier) > beam_width * 2:
                       frontier, f_max = prune(frontier, beam_width)
       # filter, sort, reduce equivalence classes to class representatives
       prior_frontier = list(frontier)
       frontier = filter(lambda p: p[0] <= f_max, frontier)
       frontier = dedup(frontier)
       if not frontier:
           # backtrack
           print('empty layer -> backtracking from %d to' % len(stack), end=' ')
           while stack and stack[-1][1] >= U:
               stack.pop()
           print(len(stack), end=' ')
           if stack:
               frame = stack[-1]
               frame[0], frame[1] = frame[1], U
               frame[2] = list(filter(lambda n: n[0] < end="'')

Now, a very obvious improvement can be had by noticing that we guarantee palindromes by construction. Once we attain the pangram property, it's entirely straightforward to complete the palindrome. I call that "peeking." Here's a way to take the short-cut at the appropriate moment during bss:


class TriePair:
   # ...
   def pivotH(self, n):
       revCompletions = self.s(self.rev, n.phrase_string_ns())
       completionWords = itertools.chain(*(t.words() for t in revCompletions))
       return min(len(w) for w in completionWords) - (32 - len(n.word)) / 32 - abs(hash(n.phrase_string_ns())) / sys.maxsize / 64
   @staticmethod
   def rightH(n, expected_length):
       return expected_length - sum(len(i) for i in n.phrase()) - abs(hash(n.phrase_string_ns())) / sys.maxsize
   def rightExpand(self, n, match, len_left):
       """Restricted search space for quickly finding
       pangramic palindrome completions.
       >>> left = 'drawknife butyls go hm op ox azo jo we vac suq us'.split()
       >>> right = 'suq us cave wo jo zax op om hog sly tube fink ward'.split()
       >>> extraneous = ['opossum']
       >>> palindrome = left + right
       >>> tries = TriePair(palindrome + extraneous)
       >>> all(tries.fwd.match(i)[1] == '' for i in palindrome)
       True
       >>> n = functools.reduce(lambda a, i: WordSetNode(i, a), left, None)
       >>> match = ''.join(left)[::-1]
       >>> rx = functools.partial(tries.rightExpand, match)
       >>> x = list(rx(n))
       >>> [i.word for i in x]
       ['suq']

       >>> n = functools.reduce(lambda a, i: WordSetNode(i, a), itertools.chain(left, itertools.takewhile(lambda p: p != 'op', right)), None)
       >>> m, r = tries.fwd.match('op')
       >>> r == '' and sum(1 for i in m.words()) > 1
       True

       >>> [i.word for i in list(rx(n))]
       ['op']

       >>> n = functools.reduce(lambda a, i: WordSetNode(i, a), palindrome[:-1], None)
       >>> palindrome[-1] in (i.word for i in list(rx(n)))
       True
       """
       m = match[sum(len(i) for i in n.phrase()) - len_left:]
       r = self.fwd.match(m)[1]
       nextWordSupersequence = m[:-len(r)] if r else m
       for i in self.fwd.filter(Trie([nextWordSupersequence]), Trie()):
           yield WordSetNode(i, n)
   def completePalindrome(self, pangram, expand, f_max = sys.maxsize, match = None, left = None):
       # the pivot of the palindrome can be anywhere
       # left of the last letter needed for the pangram.
       # So it could be in the rightmost word of pangram,
       # or it could be in words yet to be added to our
       # phrase. We'll expand the pangram in the usual
       # way elsewhere, so we don't need to be complete
       # here. We do need to remain feasible, however.
       # Our strategy here is to find a pivot that
       # coincides with a word boundary, so that we can
       # use the reverse of the phrase as the sequence
       # of letters to appear on the right. This will
       # give us a much faster way to complete a
       # palindrome and establish a tighter upper bound
       # for the rest of our search.
       # TODO: also check for more interior pivots by
       # extending keystoneP to enumerate all possible
       # interior pivots rather than just the first one.
       #  ababa -> 1, 2, 3
       if match and left:
           len_match = len(match)
           len_left = len(left)
           rx = functools.partial(self.rightExpand, match = match, len_left = l
           rh = functools.partial(TriePair.rightH, expected_length = len_left + len_match)
           print('building right with match = %s and left = %s' % (match, left))
           for completion in bss([pangram], rx, g, rh, 1, len_left + len_match + 1/sys.maxsize, feasibleP):
               yield completion
           return
       if keystoneP(pangram.word):
           match = pangram.parent.phrase_string_ns() + pangram.word[:keystoneP(pangram.word)]
           for i in self.completePalindrome(pangram, expand, f_max, match = match, left = pangram.phrase_string_ns()):
               f_max = min(f_max, g(completion))
               yield i
       for left in bss([pangram], expand, g, self.pivotH, 1, f_max, lambda n: any(t.terminus for t in self.s(self.rev, n.phrase_string_ns()) + self.s(self.rev, n.phrase_string_ns()[:-1]))):
           match = left.phrase_string_ns()[::-1]
           for m in (match[1:], match):
               for completion in self.completePalindrome(left, expand, f_max, match = m, left = match):
                   yield completion

In essence, it comes down to switching to very well-informed heuristics that can exploit our foreknowledge of the letter sequence we want in the second half of the phrase.

This still tied up my laptop for many hours. I had two ideas for making progress: (1) reduce the dictionary in hopes of lowering the upper bound for the search and (2) experiment with different heuristics in hopes of dynamically finding better solutions more quickly. As an example of (1), I tried excluding words with multiple occurrences of any letter. As an example of (2), I tried using a piecewise heuristic function that penalizes letter repetition early in the search, but also recognizes that the pangram may contain a "keystone" word overlapping its palindromic midpoint.

Here's how I disallowed intraword letter duplication:


def minKeystoneP(w):
   """Determine whether this word's interior
   contains the palindromic pivot of a theoretically
   minimal-sized pangramic palindrome.
   >>> minKeystoneP('abc')
   False

   >>> minKeystoneP('aba')
   True

   >>> minKeystoneP('abaa')
   False

   """
   k = keystoneP(w)
   if not k: return False
   sides = w[:k], w[k:]
   return all(len(s) == len(set(s)) for s in sides)

def mintries(d):
   minKeystones = [w for w in d if minKeystoneP(w)]
   nodups = [w for w in d if len(set(w)) == len(w)]
   return TriePair(minKeystones + nodups)

Here's some profiling data from a run that used a mintries dictionary (stopped at the first feasible solution). I used beam_width = 512, U = 1024:

solution (65): buckrams donzel ti pyx of vug we jo hm suq us mho jew guv foxy pit lez nods mark cub
3063736602 function calls (1650738620 primitive calls) in 15627.224 CPU seconds
Ordered by: standard name
ncalls	tottime	percall	cumtime	percall	filename:lineno(function)
1	0.000	0.000	15627.223	15627.223	<string>:1(<module>)
31125689	67.420	0.000	364.352	0.000	<string>:100(pangramP)
463188677	289.571	0.000	289.571	0.000	<string>:102(ancestors)
31119396	148.473	0.000	562.384	0.000	<string>:107(phrase)
31119396	258.985	0.000	413.911	0.000	<string>:108(<listcomp>)
31119402	137.097	0.000	896.415	0.000	<string>:109(phrase_string_ns)
183/78	0.001	0.000	0.001	0.000	<string>:121(__init__)
285653338	160.594	0.000	160.594	0.000	<string>:129(emptyP)
-495413458/-781144130	3292.676	-0.000	3292.676	-0.000	<string>:131(match)
144/39	0.001	0.000	0.001	0.000	<string>:155(add)
48/8	0.000	0.000	0.000	0.000	<string>:176(words)
1230988464/134840576	8569.205	0.000	12100.459	0.000	<string>:188(_filter_r)
134840576	127.894	0.000	12228.353	0.000	<string>:199(filter)
1910	0.005	0.000	0.043	0.000	<string>:25(palindromeP)
6248	0.159	0.000	0.931	0.000	<string>:260(s)
156545	0.201	0.000	0.528	0.000	<string>:269(<genexpr>)
25939	0.181	0.000	0.709	0.000	<string>:269(<listcomp>)
2	0.000	0.000	0.001	0.001	<string>:303(pivotH)
4	0.000	0.000	0.000	0.000	<string>:305(<genexpr>)
8	0.000	0.000	0.000	0.000	<string>:306(<genexpr>)
51	0.001	0.000	0.007	0.000	<string>:307(rightH)
872	0.001	0.000	0.001	0.000	<string>:309(<genexpr>)
87	0.001	0.000	0.009	0.000	<string>:310(rightExpand)
633	0.001	0.000	0.001	0.000	<string>:337(<genexpr>)
8/3	0.000	0.000	0.391	0.130	<string>:343(completePalindrome)
1910	0.013	0.000	0.016	0.000	<string>:37(<lambda>)
6	0.000	0.000	0.000	0.000	<string>:377(<genexpr>)
2	0.000	0.000	0.002	0.001	<string>:377(<lambda>)
6262	0.049	0.000	1.024	0.000	<string>:383(setupContinuation)
31125244	430.604	0.000	13368.363	0.000	<string>:397(expand)
21700	0.037	0.000	0.037	0.000	<string>:404(<genexpr>)
216034002	221.931	0.000	356.529	0.000	<string>:406(<genexpr>)
6301	0.075	0.000	0.168	0.000	<string>:416(visit)
6301	0.041	0.000	0.078	0.000	<string>:419(<listcomp>)
10	0.000	0.000	0.000	0.000	<string>:424(<listcomp>)
93363173/62244144	198.180	0.000	322.037	0.000	<string>:440(g)
179	0.002	0.000	0.008	0.000	<string>:450(partition)
179	0.004	0.000	0.036	0.000	<string>:458(palindrome_balance)
31118982	306.394	0.000	1302.780	0.000	<string>:521(h)
31119035	48.462	0.000	412.754	0.000	<string>:563(feasibleP)
11/2	203.033	18.458	15627.223	7813.611	<string>:566(bss)
256/128	0.195	0.001	0.298	0.002	<string>:580(prune)
125068	0.044	0.000	0.044	0.000	<string>:585(<lambda>)
31125689	75.291	0.000	286.533	0.000	<string>:60(pangramP)
31119035	71.754	0.000	1642.539	0.000	<string>:615(<lambda>)
7	0.000	0.000	0.002	0.000	<string>:617(<listcomp>)
7	0.000	0.000	0.002	0.000	<string>:618(<lambda>)
7	0.000	0.000	0.000	0.000	<string>:621(<listcomp>)
660	0.000	0.000	0.000	0.000	<string>:648(<lambda>)
7511	0.004	0.000	0.004	0.000	<string>:657(<lambda>)
6931	0.002	0.000	0.002	0.000	<string>:658(<lambda>)
6931	0.002	0.000	0.002	0.000	<string>:659(<lambda>)
45	0.012	0.000	0.016	0.000	<string>:659(<listcomp>)
641	0.000	0.000	0.000	0.000	<string>:669(<lambda>)
1	0.000	0.000	15627.223	15627.223	<string>:677(profileDriver)
2	0.000	0.000	0.000	0.000	<string>:682(keystoneP)
31119030	149.929	0.000	351.339	0.000	<string>:90(__init__)
4	0.000	0.000	0.000	0.000	_abcoll.py:122(__subclasshook__)
2	0.000	0.000	0.000	0.000	_weakrefset.py:10(__init__)
8	0.000	0.000	0.000	0.000	_weakrefset.py:20(__iter__)
31131950	48.803	0.000	48.803	0.000	_weakrefset.py:29(__contains__)
4	0.000	0.000	0.000	0.000	_weakrefset.py:36(add)
31131937	75.111	0.000	123.914	0.000	abc.py:118(__instancecheck__)
4/3	0.000	0.000	0.000	0.000	abc.py:134(__subclasscheck__)
942	0.002	0.000	0.006	0.000	ascii.py:21(encode)
1	0.000	0.000	0.000	0.000	codecs.py:164(__init__)
1	0.000	0.000	0.000	0.000	codecs.py:937(getincrementalencoder)
942	0.010	0.000	0.016	0.000	io.py:1031(write)
447	0.002	0.000	0.364	0.001	io.py:1069(flush)
447	0.004	0.000	0.361	0.001	io.py:1073(_flush_unlocked)
447	0.001	0.000	0.365	0.001	io.py:1454(flush)
942	0.003	0.000	0.004	0.000	io.py:1465(closed)
942	0.012	0.000	0.404	0.000	io.py:1479(write)
1	0.000	0.000	0.000	0.000	io.py:1500(_get_encoder)
2331	0.002	0.000	0.002	0.000	io.py:755(closed)
6301	0.005	0.000	0.005	0.000	{built-in method abs}
2	0.000	0.000	0.000	0.000	{built-in method any}
942	0.004	0.000	0.004	0.000	{built-in method ascii_encode}
1910	0.001	0.000	0.001	0.000	{built-in method ceil}
1	0.001	0.001	15627.224	15627.224	{built-in method exec}
31119030	69.197	0.000	69.197	0.000	{built-in method fromkeys}
4	0.000	0.000	0.000	0.000	{built-in method getattr}
155614108	204.289	0.000	204.289	0.000	{built-in method hasattr}
31118982	17.315	0.000	17.315	0.000	{built-in method hash}
53	0.000	0.000	0.000	0.000	{built-in method id}
31133821	76.905	0.000	200.819	0.000	{built-in method isinstance}
5	0.000	0.000	0.000	0.000	{built-in method issubclass}
155695466	50.374	0.000	50.374	0.000	{built-in method len}
1	0.000	0.000	0.000	0.000	{built-in method lookup}
179	0.000	0.000	0.000	0.000	{built-in method max}
4	0.000	0.000	0.000	0.000	{built-in method min}
6931	0.002	0.000	0.002	0.000	{built-in method next}
447	0.005	0.000	0.408	0.001	{built-in method print}
90	0.001	0.000	0.003	0.000	{built-in method sum}
1389	0.001	0.000	0.001	0.000	{method '__enter__' of '_thread.lock' objects}
2	0.000	0.000	0.000	0.000	{method '__subclasses__' of 'type' objects}
4	0.000	0.000	0.000	0.000	{method 'add' of 'set' objects}
82757	0.034	0.000	0.034	0.000	{method 'append' of 'list' objects}
1	0.000	0.000	0.000	0.000	{method 'disable' of '_lsprof.Profiler' objects}
942	0.001	0.000	0.001	0.000	{method 'extend' of 'bytearray' objects}
263060106	78.312	0.000	78.312	0.000	{method 'items' of 'dict' objects}
31120994	126.532	0.000	126.532	0.000	{method 'join' of 'str' objects}
31131990	10.402	0.000	10.402	0.000	{method 'keys' of 'dict' objects}
31	0.000	0.000	0.000	0.000	{method 'pop' of 'list' objects}
31119029	111.008	0.000	111.008	0.000	{method 'update' of 'dict' objects}
447	0.357	0.001	0.357	0.001	{method 'write' of '_FileIO' objects}

Using the same parameters but substituting a piecewise expand function to reduce the search space, I produced the same solution in 12949.838 CPU seconds (shortly after the 7232nd node expansion):


# this function takes a TriePair to an expand function
# suitable for use with bss:
def budgetExpand(ordinarytries):
   def bex(n, advisoryBudget):
       if n.pangramP():
           for x in ordinarytries.expand(n):
               yield x
           return
       letterAllowance = int((advisoryBudget / 2) - g(n))
       if letterAllowance < 5:
           missing = len(string.ascii_lowercase) - len(n.covered)
           if missing < letterAllowance:
               # we might be able to afford to waste the next word
               for x in ordinarytries.expand(n):
                   yield x
               return
           if missing == 2:
               # find a suitable tp
               for lp, t in raretries:
                   if not any(l in n.covered for l in lp):
                       for x in t.expand(n):
                           yield x
                       return
           if missing <= 4:
               letters = string.ascii_lowercase - n.covered.keys()
               if all(l in lettertries for l in letters):
                   for x in itertools.chain.from_iterable(t.expand(n) for l, t in lettertries.items() if not n.coveredP(l)):
                       yield x
                   return
       for x in ordinarytries.expand(n):
           yield x
   return bex

2481633428 function calls (1172623990 primitive calls) in 12949.838 CPU seconds
Ordered by: standard name
ncalls	tottime	percall	cumtime	percall	filename:lineno(function)
1	0.000	0.000	12949.838	12949.838	<string>:1(<module>)
28221664	56.848	0.000	307.680	0.000	<string>:100(pangramP)
396097158	231.455	0.000	231.455	0.000	<string>:102(ancestors)
28206562	130.097	0.000	464.824	0.000	<string>:107(phrase)
28206562	210.286	0.000	334.727	0.000	<string>:108(<listcomp>)
28207750	120.352	0.000	750.641	0.000	<string>:109(phrase_string_ns)
75/32	0.000	0.000	0.001	0.000	<string>:121(__init__)
265064474	140.758	0.000	140.758	0.000	<string>:129(emptyP)
-776844655/-1041987795	2885.248	-0.000	2885.248	-0.000	<string>:131(match)
59/16	0.000	0.000	0.000	0.000	<string>:155(add)
24/4	0.000	0.000	0.000	0.000	<string>:176(words)
1140618203/124958593	6869.411	0.000	9963.410	0.000	<string>:188(_filter_r)
124958593	113.443	0.000	10076.852	0.000	<string>:199(filter)
394	0.001	0.000	0.145	0.000	<string>:25(palindromeP)
7260	0.162	0.000	0.953	0.000	<string>:260(s)
176364	0.208	0.000	0.545	0.000	<string>:269(<genexpr>)
28381	0.188	0.000	0.733	0.000	<string>:269(<listcomp>)
1	0.000	0.000	0.000	0.000	<string>:303(pivotH)
2	0.000	0.000	0.000	0.000	<string>:305(<genexpr>)
4	0.000	0.000	0.000	0.000	<string>:306(<genexpr>)
21	0.000	0.000	0.002	0.000	<string>:307(rightH)
350	0.000	0.000	0.000	0.000	<string>:309(<genexpr>)
36	0.000	0.000	0.003	0.000	<string>:310(rightExpand)
251	0.000	0.000	0.000	0.000	<string>:337(<genexpr>)
5/2	0.000	0.000	0.233	0.117	<string>:343(completePalindrome)
394	0.002	0.000	0.142	0.000	<string>:37(<lambda>)
3	0.000	0.000	0.000	0.000	<string>:377(<genexpr>)
1	0.000	0.000	0.001	0.001	<string>:377(<lambda>)
9180	0.052	0.000	1.037	0.000	<string>:383(setupContinuation)
28215608	356.822	0.000	11018.065	0.000	<string>:397(expand)
28840	0.049	0.000	0.049	0.000	<string>:404(<genexpr>)
183935843	180.895	0.000	287.857	0.000	<string>:406(<genexpr>)
7287	0.070	0.000	0.174	0.000	<string>:416(visit)
7287	0.045	0.000	0.089	0.000	<string>:419(<listcomp>)
10	0.000	0.000	0.000	0.000	<string>:424(<listcomp>)
84626570/56420123	168.273	0.000	271.354	0.000	<string>:440(g)
29	0.000	0.000	0.002	0.000	<string>:450(partition)
29	0.001	0.000	0.008	0.000	<string>:458(palindrome_balance)
28206428	257.713	0.000	1093.465	0.000	<string>:521(h)
28206451	41.020	0.000	348.655	0.000	<string>:563(feasibleP)
7/2	170.547	24.364	12949.838	6474.919	<string>:566(bss)
252/126	0.193	0.001	0.258	0.002	<string>:580(prune)
127106	0.048	0.000	0.048	0.000	<string>:585(<lambda>)
28221664	63.621	0.000	242.118	0.000	<string>:60(pangramP)
28206450	63.101	0.000	1381.873	0.000	<string>:616(<lambda>)
4	0.000	0.000	0.001	0.000	<string>:618(<listcomp>)
4	0.000	0.000	0.001	0.000	<string>:619(<lambda>)
4	0.000	0.000	0.000	0.000	<string>:622(<listcomp>)
497	0.000	0.000	0.000	0.000	<string>:649(<lambda>)
8533	0.004	0.000	0.004	0.000	<string>:658(<lambda>)
7784	0.002	0.000	0.002	0.000	<string>:659(<lambda>)
7784	0.004	0.000	0.004	0.000	<string>:660(<lambda>)
759	0.000	0.000	0.000	0.000	<string>:670(<lambda>)
1	0.000	0.000	0.000	0.000	<string>:683(keystoneP)
28213699	29.299	0.000	11048.220	0.000	<string>:826(bex)
35164	0.018	0.000	0.018	0.000	<string>:844(<genexpr>)
3779	0.003	0.000	0.003	0.000	<string>:852(<genexpr>)
3731	0.024	0.000	0.032	0.000	<string>:854(<genexpr>)
28206448	126.205	0.000	295.114	0.000	<string>:90(__init__)
14576	0.008	0.000	0.008	0.000	<string>:98(coveredP)
4	0.000	0.000	0.000	0.000	_abcoll.py:122(__subclasshook__)
2	0.000	0.000	0.000	0.000	_weakrefset.py:10(__init__)
8	0.000	0.000	0.000	0.000	_weakrefset.py:20(__iter__)
28228934	40.811	0.000	40.811	0.000	_weakrefset.py:29(__contains__)
4	0.000	0.000	0.000	0.000	_weakrefset.py:36(add)
28228924	63.055	0.000	103.866	0.000	abc.py:118(__instancecheck__)
4/3	0.000	0.000	0.000	0.000	abc.py:134(__subclasscheck__)
6399	0.014	0.000	0.037	0.000	ascii.py:21(encode)
1	0.000	0.000	0.000	0.000	codecs.py:164(__init__)
1	0.000	0.000	0.000	0.000	codecs.py:937(getincrementalencoder)
6399	0.066	0.000	0.103	0.000	io.py:1031(write)
3190	0.014	0.000	0.670	0.000	io.py:1069(flush)
3190	0.026	0.000	0.655	0.000	io.py:1073(_flush_unlocked)
3190	0.009	0.000	0.679	0.000	io.py:1454(flush)
6399	0.013	0.000	0.021	0.000	io.py:1465(closed)
6399	0.075	0.000	0.919	0.000	io.py:1479(write)
1	0.000	0.000	0.000	0.000	io.py:1500(_get_encoder)
15988	0.014	0.000	0.014	0.000	io.py:755(closed)
7287	0.005	0.000	0.005	0.000	{built-in method abs}
923	0.003	0.000	0.005	0.000	{built-in method all}
15357	0.013	0.000	0.024	0.000	{built-in method any}
6399	0.023	0.000	0.023	0.000	{built-in method ascii_encode}
394	0.139	0.000	0.139	0.000	{built-in method ceil}
1	0.000	0.000	12949.838	12949.838	{built-in method exec}
911	0.001	0.000	0.001	0.000	{built-in method from_iterable}
28206448	58.517	0.000	58.517	0.000	{built-in method fromkeys}
4	0.000	0.000	0.000	0.000	{built-in method getattr}
141057204	171.326	0.000	171.326	0.000	{built-in method hasattr}
28206428	14.606	0.000	14.606	0.000	{built-in method hash}
22	0.000	0.000	0.000	0.000	{built-in method id}
28241722	65.730	0.000	169.595	0.000	{built-in method isinstance}
5	0.000	0.000	0.000	0.000	{built-in method issubclass}
141174090	42.904	0.000	42.904	0.000	{built-in method len}
1	0.000	0.000	0.000	0.000	{built-in method lookup}
29	0.000	0.000	0.000	0.000	{built-in method max}
2	0.000	0.000	0.000	0.000	{built-in method min}
7766	0.002	0.000	0.002	0.000	{built-in method next}
3190	0.031	0.000	0.949	0.000	{built-in method print}
37	0.000	0.000	0.001	0.000	{built-in method sum}
9589	0.009	0.000	0.009	0.000	{method '__enter__' of '_thread.lock' objects}
2	0.000	0.000	0.000	0.000	{method '__subclasses__' of 'type' objects}
4	0.000	0.000	0.000	0.000	{method 'add' of 'set' objects}
85797	0.034	0.000	0.034	0.000	{method 'append' of 'list' objects}
1	0.000	0.000	0.000	0.000	{method 'disable' of '_lsprof.Profiler' objects}
6399	0.008	0.000	0.008	0.000	{method 'extend' of 'bytearray' objects}
244196206	68.331	0.000	68.331	0.000	{method 'items' of 'dict' objects}
28206877	105.938	0.000	105.938	0.000	{method 'join' of 'str' objects}
28229874	8.717	0.000	8.717	0.000	{method 'keys' of 'dict' objects}
15	0.000	0.000	0.000	0.000	{method 'pop' of 'list' objects}
28206447	92.287	0.000	92.287	0.000	{method 'update' of 'dict' objects}
3190	0.626	0.000	0.626	0.000	{method 'write' of '_FileIO' objects}

What about that penultimate case, searching up to four overlapping TriePairs each representing the set of words containing a missing letter? How much does it help? The mintriesd superset has 35103 words, and a sample of four-combinations range in size from 11263 to 24428 words. But, the average size lettertrie has 4784 words, so we get a noticeable speedup even though trie search has logarithmic time complexity. Without the lettertries case, we need 2656858411 function calls (1319973770 primitive calls) in 14205.551 CPU seconds.

Finally, I added to the budgetExpand repertoire with lengthtries.

The idea for lengthtries comes from the observation that it would be useful to limit the depth of trie traversals as we approach the current cost bound. With this technique I brought the first solution discovery down to 398442028 function calls in 9485.551 CPU seconds:


def budgetExpand(ordinarytries):
    raretries = [(lp, TriePair([w for w in ordinarytries.fwd.words() if all(l in w for l in lp)])) for lp in 'jm vx jy fx jk gq mq jp qy bq hq pq kz jv wz hj fq qv xz kq jw fj qz wx jz kx qw jx qx jq'.split()]
    hist = Histogram(ordinarytries.fwd.words())
    lettertries = {l : TriePair(hist[l]) for l in [k for k, v in sorted(hist.h.items(), key = lambda i: len(i[1]))][:16]}
    lengthtries = {i : TriePair((w for w in mintriesd.fwd.words() if len(w) <= i or keystoneP(w)), mintriesd.rev.words()) for i in range(1, 8)}
    def bex(n, advisoryBudget):
        if n.pangramP():
            for x in ordinarytries.expand(n):
                yield x
            return
        letterAllowance = int((advisoryBudget / 2) - g(n))
        if letterAllowance < 5:
            missing = len(string.ascii_lowercase) - len(n.covered)
            if missing < letterAllowance:
                # we might be able to afford to waste the next word
                for x in lengthtries[letterAllowance].expand(n):
                    yield x
                return
            if missing == 2:
                # find a suitable tp
                for lp, t in raretries:
                    if not any(l in n.covered for l in lp):
                        for x in t.expand(n):
                            yield x
                        return
            if missing <= 4:
                letters = string.ascii_lowercase - n.covered.keys()
                if all(l in lettertries for l in letters):
                    for x in itertools.chain.from_iterable(t.expand(n) for l, t in lettertries.items() if not n.coveredP(l)):
                        yield x
                    return
        elif letterAllowance in lengthtries:
            for x in lengthtries[letterAllowance].expand(n):
                yield x
            return
        for x in ordinarytries.expand(n):
            yield x
    return bex

One last idea for search space reduction is to exclude words containing covered letters:


def budgetExpand(ordinarytries):
   # least-common 2-combinations of letters in d:
   # [('kx', 53), ('qw', 27), ('jx', 27), ('qx', 26), ('jq', 8)]
   raretries = [(lp, TriePair((w for w in ordinarytries.fwd.words() if all(l in w for l in lp)), ordinarytries.rev)) for lp in 'jm vx jy fx jk gq mq jp qy bq hq pq kz jv wz hj fq qv xz kq jw fj qz wx jz kx qw jx qx jq'.split()]
   hist = Histogram(ordinarytries.fwd.words())
   lettertries = {l : TriePair(hist[l], ordinarytries.rev) for l in [k for k, v in sorted(hist.h.items(), key = lambda i: len(i[1]))][:16]}
   lengthtries = {i : TriePair((w for w in ordinarytries.fwd.words() if len(w) <= i or keystoneP(w)), ordinarytries.rev) for i in range(1, 8)}
   def mfl(k, n):
       """Kth most frequent letter k, n -> (#, a)"""
       if not hasattr(n, 'hist'):
           n.hist = Histogram(n.tp.fwd.words())
       lf = list(itertools.islice(n.hist, k + 1))[-1]
       return (sum(1 for i in lf[1]), lf[0])
   exttltries = Trie()
   exttltries.tp = mintriesd
   # this isn't optimal; there may be unreachable
   # nodes and some of those may be more selective
   # than letter-equivalent reachable nodes. But
   # it won't yield incorrect results and it's
   # an approximation I can build quickly.
   for i in range(32):
       (freq, split_char), parent = max(
           (mfl(len(n.children), n), n) for n in
           exttltries.pre_order_nodes() if len(n.children) <= 4)
       parent.add(split_char)
       subtrie = parent.search(split_char)
       subtrie.tp = TriePair(Trie(w for w in parent.tp.fwd.words() if split_char not in w), parent.tp.rev)
   print([x for x in exttltries.pre_order_nodes()])
   print([sum(1 for y in x.tp.fwd.words()) for x in exttltries.pre_order_nodes()])
   def pickTP(t, n):
       if (len(t.children) == 0): return t.tp
       for l in t.children:
           if not l in n.covered: continue
           return pickTP(t.children[l], n)
       return t.tp
   def bex(n, advisoryBudget):
       if n.pangramP():
           for x in ordinarytries.expand(n):
               yield x
           return
       letterAllowance = int((advisoryBudget / 2) - g(n))
       if letterAllowance < 5:
           missing = len(string.ascii_lowercase) - len(n.covered)
           if missing < letterAllowance:
               # we might be able to afford to waste the next word
               for x in lengthtries[letterAllowance].expand(n):
                   yield x
               return
           if missing == 2:
               # find a suitable tp
               for lp, t in raretries:
                   if not any(l in n.covered for l in lp):
                       for x in t.expand(n):
                           yield x
                       return
           if missing <= 4:
               letters = string.ascii_lowercase - n.covered.keys()
               if all(l in lettertries for l in letters):
                   for x in itertools.chain.from_iterable(t.expand(n) for l, t in lettertries.items() if not n.coveredP(l)):
                       yield x
                   return
       if letterAllowance in lengthtries:
           for x in lengthtries[letterAllowance].expand(n):
               yield x
           return
       for x in pickTP(exttltries, n).expand(n):
           yield x
   return bex

There are unreachable TriePairs in exttltries, and some of those are smaller than letter-equivalent reachable ones. To avoid this, we should maintain the exttltries property that no node's letter should appear in the set of siblings of its ancestors. Together with the greedy order for splitting nodes, this minimizes size and optimizes expected search cost. But we do pretty well even with the quick and dirty experiment: we find a first solution (cost 67) before the 7808 node expansion in under 4562.899 CPU seconds. A 65 solution is found before the 9984 node expansion in under 5141.208 CPU seconds.

I never got around to rewriting variations in search of more beautiful code. I still wonder what other tricks I've overlooked for improving the heuristic function or reducing the search space.

Rumor has it that Common Lisp has been really well-optimized over the years, so I wonder how much faster this could go if rewritten in lisp. Also, I could go twice as fast on my dual-core CPU if python had threads (just partition the input) or if I quit some other apps in order to fit two python instances of this program in my 4GB of RAM.

Many have posted solutions for this problem. Solutions I can recommend reading if this post has piqued your interest include Jim Vellenga's and Keoki Zee's.

Sean Foy

Wednesday, February 8, 2012

Book Review: The Name of the Rose

Saturday, December 31, 2011

Book Review: The House of Wittgenstein

Wednesday, November 30, 2011

Book Review: The Annotated Turing

Monday, October 31, 2011

Book Review: The Power Broker: Robert Moses and the Fall of New York

Sunday, January 2, 2011

Book Review: The Time Traveler's Guide to Medieval England

Saturday, January 1, 2011

Book Review: The Ghost Map

Thursday, April 22, 2010

Palindromic Pangrams

Blog Archive

Followers

feeds I follow