Thursday, June 20, 2013

But wait, Spot-It! has only 55 cards.

In a previous post and its follow-up I described the process by which I decided that Spot-It! likely used 57 symbols and that it could have up to 57 cards in its deck. But as mentioned earlier, the deck actually has only 55 cards. Did they just leave two cards out? Do they have more symbols, or fewer?

Yes, they just left two cards out; no, they use exactly 57 symbols. To figure this out, I created a file "cards.txt", with the name of each symbol on a single line. The symbols on a given card are grouped together in an eight-line stanza; an empty line separates stanzas.

yin-yang
paint
tree
lightning
zebra
clef
ladybug
Canada

bomb
dragon
carrot
paint
ghost
question
bang
clown

spider
…etc.
The excerpt above signifies that:
  • one card contains the yin/yang sign, the paint dots, tree, lightning bolt, zebra, (treble) clef, ladybug, maple leaf;
  • another card has the bomb, dragon, carrot, paint, ghost, question mark, exclamation point…
you get the idea.

Alert readers may notice that "cards.txt" has names consisting of only one "word" (by which I mean a contiguous sequence of non-whitespace characters) each, and that some of them don't exactly match my summary. The one-"word" thing makes them easier for shell scripts to process, and I translate these internal names into less-cryptic (I hope) things like "maple leaf" (Canada) or "exclamation point" (bang).
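The stanza format described above (one-word names, a fixed number of lines per card, blank lines between cards) can be checked mechanically. The following is only a sketch of one possible check, not the one actually used for this post; the function name check_stanzas and the made-up two-card sample are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not from the original post): sanity-check a stanza file.
# Usage: check_stanzas FILE [SYMBOLS_PER_CARD]   (default 8)
check_stanzas() {
    awk -v want="${2:-8}" '
        BEGIN { RS = ""; FS = "\n" }    # paragraph mode: one record per stanza
        {
            if (NF != want) {
                printf "stanza %d: %d lines, expected %d\n", NR, NF, want
                bad = 1
            }
            split("", seen)             # portable way to empty an array
            for (i = 1; i <= NF; i++) {
                if ($i ~ /[ \t]/) {
                    printf "stanza %d: not one word: %s\n", NR, $i
                    bad = 1
                }
                if ($i in seen) {
                    printf "stanza %d: repeated symbol: %s\n", NR, $i
                    bad = 1
                }
                seen[$i] = 1
            }
        }
        END { exit bad }' "$1"
}

# Demo on a made-up file: 3 symbols per card, second stanza one line short.
demo=$(mktemp)
printf 'tree\npaint\nzebra\n\nbomb\npaint\n' > "$demo"
check_stanzas "$demo" 3 || echo "format problems found"
rm -f "$demo"
```

For the real file the call would be check_stanzas cards.txt, which relies on the default of 8 symbols per card.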

The format of "cards.txt" made for easy error checking, which as it turned out was necessary. (If you're thinking at this point, "How anal!" all I can say is "guilty as charged, yer honor.") It also made for some easy analysis. For example, we learn that although most of the symbols appear on 8 cards each, some don't:

% grep -v '^$' cards.txt | sort | uniq -c | sort -n | head -n16
   6 snowman
   7 Canada
   7 bang
   7 cactus
   7 daisy
   7 dinosaur
   7 dog
   7 eye
   7 ice
   7 ladybug
   7 light-bulb
   7 man
   7 question
   7 skull
   7 stop
   8 anchor
% grep -v '^$' cards.txt | sort | uniq -c | sort -n | tail -n3
   8 web
   8 yin-yang
   8 zebra
% 
The (American) English translation of the above is: only six cards have a snowman. Fourteen symbols (maple leaf, exclamation point, cactus, daisy, etc.) appear on only 7 cards each; the rest appear on 8 cards each. This suggests that two cards could be added to the deck. Each card would have a snowman and 7 of the other symbols. Which symbols? Well, any symbol that would appear with the maple leaf (aka "Canada") on a new card must not already appear with the maple leaf on another card. How to decide that?
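Before turning to that question, the counts above can be cross-checked with a little shell arithmetic. This is just a worked sum, assuming 57 symbols in total (so 57 - 1 - 14 = 42 symbols appear on 8 cards each); the variable name is made up.

```shell
#!/bin/sh
# Worked arithmetic for the counts above (variable name is illustrative).
# 1 symbol on 6 cards, 14 symbols on 7 cards, the remaining 42 on 8 cards.
appearances=$(( 1*6 + 14*7 + 42*8 ))

echo "symbol appearances counted:   $appearances"
echo "55 cards x 8 symbols each:    $(( 55 * 8 ))"
echo "short of a 57-card deck by:   $(( 57 * 8 - appearances ))"
echo "snowman short by 2, 14 others short by 1: $(( 2 + 14 ))"
```

The first two numbers agree, and the shortfall is exactly two cards' worth of symbols, which is consistent with two missing cards that each carry the snowman.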

It takes a few steps. First, a shell script converts "cards.txt" into another file, "cards-sorted.txt":

% L1=1; while [[ $L1 -lt 490 ]] ; do 
    ((L2=L1+7)); sed -n $L1,${L2}p cards.txt | sort | fmt -w222; 
    ((L1=L1+9)); done | 
    sort > cards-sorted.txt
% 
This takes, e.g., lines 1-8 of "cards.txt", sorts them, and combines them into a single line. Then it takes lines 10-17 (that's eight lines) and does the same with them. Then lines 19-26, and so on.
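The loop above depends on every stanza being exactly eight lines plus one blank line. A variant that does not count lines could lean on awk's paragraph mode instead. This is a sketch only: the function name stanzas_to_lines is hypothetical, and LC_ALL=C is an assumption made so that uppercase names such as "Canada" sort ahead of lowercase ones, as they do in the listing that follows.

```shell
#!/bin/sh
# Hypothetical alternative to the sed loop: one sorted line per stanza.
stanzas_to_lines() {
    # Stage 1: tag every symbol with its stanza number.
    awk 'BEGIN { RS = ""; FS = "\n" }
         { for (i = 1; i <= NF; i++) print NR, $i }' "$1" |
    # Stage 2: order by stanza number, then by symbol name.
    LC_ALL=C sort -k1,1n -k2,2 |
    # Stage 3: join the symbols of each stanza onto one line.
    awk '{ if ($1 != prev) { if (NR > 1) printf "\n"; prev = $1; sep = "" }
           printf "%s%s", sep, $2; sep = " " }
         END { if (NR > 0) printf "\n" }' |
    # Stage 4: sort the cards themselves, like the final sort in the loop.
    LC_ALL=C sort
}

# Demo on a made-up two-card file with 3 symbols per card.
demo=$(mktemp)
printf 'tree\npaint\nCanada\n\nbomb\npaint\ncarrot\n' > "$demo"
stanzas_to_lines "$demo"
rm -f "$demo"
```

On the real data this would be run as stanzas_to_lines cards.txt > cards-sorted.txt.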

Each line in "cards-sorted.txt" thus corresponds to a single card, and contains the card's one-word symbol names in alphabetical order. Consequently, cards-sorted.txt starts off like this:

% head cards-sorted.txt 
Canada anchor carrot cheese clock knight stop web
Canada apple bang igloo moon scissors snowflake spider
Canada art balloon bomb drop fire lips skull
Canada bottle candle ghost light-bulb lock pencil sunglasses
Canada car cat clover clown dog ok sun
Canada clef ladybug lightning paint tree yin-yang zebra
Canada dolphin dragon eye hand heart key target
anchor apple art dinosaur dolphin ghost ladybug ok
anchor balloon clown lightning snowman spider sunglasses target
anchor bang car clef daisy key lips lock
% 
Now I'll use "cards-sorted.txt" to see which of those fourteen symbols appear together with "Canada" (maple leaf) on any card:
% f() { grep Canada.*$1 cards-sorted.txt; }
% C=; D=; grep -v '^$' cards.txt |sort | uniq -c | sort -n | grep 7 | 
    while read A B; do 
       echo === $B ===
       if f $B; then C="$C $B"; else D="$D $B"; fi 
       echo C=$C
       echo D=$D
    done
… [much output deleted]
C= bang dog eye ladybug light-bulb skull stop
D= Canada cactus daisy dinosaur ice man question
% 
The "C=" symbols already appear together with "Canada" (maple leaf), so we won't put them on the same card as a maple leaf again. Therefore, I think that the following cards, if added, would "complete" a set of 57 cards:
  • snowman, exclamation point, dog, eye, ladybug, light bulb, skull, STOP!
  • snowman, maple leaf, cactus, daisy, dinosaur, ice cube, man, question mark
I wasn't quite confident in this, so I wrote this brief shell script to check it.
% cat checkit.sh 
#!/bin/bash
# Ensure that for any pair of symbols within $C [or $D], no single card 
# contains both members of the pair.

C="bang dog eye ladybug light-bulb skull stop"
D="Canada cactus daisy dinosaur ice man question"

all_combos() {
    while [[ $# -ge 2 ]] ; do
        first=$1
        shift
        for it in $*; do
            echo checking $first $it ...
            if grep $first cards-sorted.txt | grep $it; then 
                echo ERROR $first $it 
            fi
        done
    done
}

all_combos $C
all_combos $D
% 
A visual inspection of the output revealed that we were indeed checking for "bang" and "dog" together, then "bang" and "eye"... and so on. It did all that without ever printing "ERROR", so I think the list is good.
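One caveat about the plain grep test in checkit.sh: grep matches substrings, so a search for "man" also hits "snowman". The later check-complete.sh pads names with spaces, which avoids this. Another way to compare whole names is to let awk split each card into fields. The sketch below assumes one card per line, as in "cards-sorted.txt"; the function name on_same_card and the sample cards are made up.

```shell
#!/bin/sh
# Hypothetical helper: succeed (and print the card) only if both symbols
# appear on the same line as whole, space-separated names.
# Usage: on_same_card SYMBOL1 SYMBOL2 FILE
on_same_card() {
    awk -v a="$1" -v b="$2" '
        {
            fa = 0; fb = 0
            for (i = 1; i <= NF; i++) {
                if ($i == a) fa = 1
                if ($i == b) fb = 1
            }
            if (fa && fb) { print; found = 1 }
        }
        END { exit !found }' "$3"
}

# Demo on two made-up cards.
demo=$(mktemp)
printf 'anchor balloon snowman\ncactus ice man\n' > "$demo"
# Plain grep is fooled by "snowman":
grep anchor "$demo" | grep man
# Whole-name matching is not:
on_same_card anchor man "$demo" || echo "anchor and man never share a card"
rm -f "$demo"
```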

That sure was fun! But let me check one more time. I'll add the above two cards to the list of sorted cards:

% diff cards-sorted.txt  cards-complete.txt 
55a56,57
> snowman bang dog eye ladybug light-bulb skull stop
> snowman Canada cactus daisy dinosaur ice man question
% 
Then let me ensure that, for any pair of symbols on a given card, that card is the only one containing the pair:
% cat check-complete.sh 
#!/bin/bash
# based on checkit.sh -- for every card in the 
# hypothetical "complete" Spot-It! deck (in "cards-complete.txt"),
# extract every pair of symbols. If both members of the pair appear
# on any other card in the complete deck, then print "ERROR"...
TEMPFILE=/tmp/spot-tmp.$$
all_combos() {
    while [[ $# -ge 2 ]] ; do
        first=$1
        shift
        for it in $*; do
            echo checking $first $it ...
            if grep " $first " $TEMPFILE | grep " $it "; then 
                echo ERROR $first $it 
            fi
        done
    done
}
C=cards-complete.txt 
cat $C | while read X; do
    grep -v "$X" $C | sed -e 's/^/ /' -e 's/$/ /' > $TEMPFILE
    all_combos $X
done
rm -f /tmp/spot-tmp*
% 
The above script, "check-complete.sh", completed without ever saying "ERROR". To make sure that it actually works, though, I changed the last card to say "igloo Canada…" rather than the correct "snowman Canada…". The script did in fact catch the error, as you can see below:
% ./check-complete.sh > x
% grep -m5 ERROR x
ERROR Canada igloo
ERROR igloo question
ERROR igloo man
ERROR ice igloo
ERROR cactus igloo
% 
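The same property can also be tested in one pass: list every pair of symbols on every card, sort the list, and look for repeats. If no pair repeats, a 57-card deck with 8 symbols per card yields 57 × 28 = 1596 distinct pairs, which equals 57 × 56 / 2, the number of possible pairs among 57 symbols. What follows is a sketch under the assumption of one card per line; the function names list_pairs and duplicate_pairs and the sample deck are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers: find symbol pairs that occur on more than one card.
list_pairs() {
    awk '{ for (i = 1; i < NF; i++)
               for (j = i + 1; j <= NF; j++) {
                   # Put each pair in a fixed order so "a b" equals "b a".
                   if ($i < $j) print $i, $j; else print $j, $i
               } }' "$1"
}

duplicate_pairs() {
    # uniq -d prints one copy of every line that occurs more than once.
    list_pairs "$1" | LC_ALL=C sort | uniq -d
}

# Demo on a made-up deck in which "a" and "b" share two cards.
demo=$(mktemp)
printf 'a b c\na d e\nb d f\nb a g\n' > "$demo"
duplicate_pairs "$demo"
rm -f "$demo"
```

For the hypothetical complete deck, duplicate_pairs cards-complete.txt would be expected to print nothing, and list_pairs cards-complete.txt | wc -l would be expected to report 1596.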

Am I satisfied now? I refer you to the "anal" comment above. (In other words, No.) One more thing: the above check-complete.sh verified that we didn't have any PAIR of symbols in common between any pair of cards. What it didn't do is verify that there was in fact any symbol in common between any pair of cards. That's done by this script:

% cat overlaps.sh
#!/bin/bash
# Confirm that exactly one symbol overlaps between every card and every other card.
TEMPFILE=/tmp/spot-tmp.$$

check() {
    # Given two cards ($1, $2), ensure exactly one word is in common.
    if [[ $# -ne 2 ]] ; then
        echo "check() called with wrong # of args"
        exit 1;
    fi
    ACARD="$1"
    BCARD="$2"
    NUM=0
    for asym in $ACARD; do
        for bsym in $BCARD; do
            if [[ $asym == $bsym ]] ; then
                ((NUM=NUM+1))
            fi
        done
    done
    if [[ $NUM -ne 1 ]] ; then
        echo ERROR: card1=$ACARD
        echo ERROR: card2=$BCARD
    fi
}

verify_one() {
    # Handle one card's syms (args) vs the rest of the deck ($TEMPFILE)
    ONE="$*"
    cat $TEMPFILE | while read IT; do
        echo checking $ONE ==vs== $IT
        check "$ONE" "$IT"
    done
}
C=cards-complete.txt 
cat $C | while read X; do
    grep -v "$X" $C | sed -e 's/^/ /' -e 's/$/ /' > $TEMPFILE
    verify_one $X
done
rm -f /tmp/spot-tmp*
% 
This completed without error, but just to make sure, I modified the last card to use nonexistent symbol "snowman2" and re-ran it, yielding:
% ./overlaps.sh > x; grep -m6 ERROR x
ERROR: card1=anchor balloon clown lightning snowman spider sunglasses target
ERROR: card2=snowman2 Canada cactus daisy dinosaur ice man question
ERROR: card1=apple bomb cat hand lock snowman tree web
ERROR: card2=snowman2 Canada cactus daisy dinosaur ice man question
ERROR: card1=art candle carrot key moon snowman sun yin-yang
ERROR: card2=snowman2 Canada cactus daisy dinosaur ice man question
% 
Of course, each pair of cards shown (there are lots more) should have the "snowman" in common. By tweaking the last card, I removed that sharing. So the script works, and the hypothetical deck is correct.
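For comparison, the same overlap test can be written as a single awk program that holds the deck in memory and compares each pair of cards once. This is a sketch rather than a replacement for overlaps.sh: the function name bad_overlaps is hypothetical, and the demo uses a made-up seven-card, three-symbol deck instead of the real data.

```shell
#!/bin/sh
# Hypothetical helper: print every pair of cards that does not share
# exactly one symbol, prefixed by the number of symbols they do share.
bad_overlaps() {
    awk '{ card[NR] = $0 }
         END {
             for (i = 1; i < NR; i++) {
                 split("", seen)               # portable way to empty an array
                 n = split(card[i], a, " ")
                 for (k = 1; k <= n; k++) seen[a[k]] = 1
                 for (j = i + 1; j <= NR; j++) {
                     shared = 0
                     m = split(card[j], b, " ")
                     for (k = 1; k <= m; k++) if (b[k] in seen) shared++
                     if (shared != 1)
                         printf "%d: %s | %s\n", shared, card[i], card[j]
                 }
             }
         }' "$1"
}

# Demo: a small deck in which every two cards share exactly one symbol...
demo=$(mktemp)
printf 'a b d\nb c e\nc d f\nd e g\ne f a\nf g b\ng a c\n' > "$demo"
bad_overlaps "$demo"      # prints nothing
# ...and the same deck with "g" on the last card replaced by a new symbol.
printf 'a b d\nb c e\nc d f\nd e g\ne f a\nf g b\nh a c\n' > "$demo"
bad_overlaps "$demo"      # reports the two cards that used to share "g"
rm -f "$demo"
```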
