# SICP section 2.3 - symbolic data

## Exercise 2.55

As mentioned in the footnotes, ' is syntactic sugar for quote. Therefore, the expression (car '' abracadabra) is equivalent to (car '(quote abracadabra)).

'(quote abracadabra) is the list containing the two symbols quote and abracadabra, which makes the car of the list the symbol quote.

## Exercise 2.56

(define (exponentiation? x)
(and (pair? x) (eq? (car x) '**)))

(define (base x) (cadr x))

(define (exponent x) (caddr x))

(define (make-exponentiation x y)
(cond ((=number? y 0) 1)
((=number? y 1) x)
(else (list '** x y))))


## Exercise 2.57

Per the question, the addend of a sum is the first term (this is already the case, so it remains unchanged), but augend needs to be modified so it is the sum of the rest of the terms:

(define (augend s)
(accumulate make-sum 0 (cddr s)))


Same deal with multiplicand:

(define (multiplicand p)
(accumulate make-product 1 (cddr p)))


## Exercise 2.58

Part (a): as the question states in its preamble, the differentiation program is defined in terms of abstract data, so modifying the program to use infix notation is just a matter of changing the predicates, selectors and constructors that define the representation of the algebraic expressions:

(define (make-sum a1 a2)
(cond ((=number? a1 0) a2)
((=number? a2 0) a1)
((and (number? a1) (number? a2)) (+ a1 a2))
(else (list a1 '+ a2))))

(define (sum? x)
(and (pair? x) (eq? (cadr x) '+)))

(define (addend s) (car s))

(define (augend s) (caddr s))

(define (make-product m1 m2)
(cond ((or (=number? m1 0) (=number? m2 0)) 0)
((=number? m1 1) m2)
((=number? m2 1) m1)
((and (number? m1) (number? m2)) (* m1 m2))
(else (list m1 '* m2))))

(define (product? x)
(and (pair? x) (eq? (cadr x) '*)))

(define (multiplier p) (car p))

(define (multiplicand p) (caddr p))


Part (b): this part was much tougher. Instead of designing different predicates, selectors and constructors to handle standard algebraic notation, I took the approach of implementing an incomplete, modified version of the shunting-yard algorithm that reassembles algebraic expressions with the parentheses necessary for our derivative program to work:

(define (op? token)
(or (eq? token '+) (eq? token '*)))

(define (precedence op)
(cond ((eq? op '+) 1)
((eq? op '*) 2)
(else (error "(precedence) unknown operator"))))

(define (>precedence? o1 o2)
(> (precedence o1) (precedence o2)))

(define (apply-op output op)
(let ((lhs (cadr output))
(rhs (car output)))
(cons (cond ((eq? op '+) (make-sum lhs rhs))
((eq? op '*) (make-product lhs rhs))
(else (error "(apply-op) unknown operator")))
(cddr output))))

(define (shunting-yard exp)
(define (iter output operators exp)
(define (push-operator op)
(iter output (cons op operators) (cdr exp)))
(define (pop-operator)
(iter (apply-op output (car operators)) (cdr operators) exp))
(define (push-operand out)
(iter (cons out output) operators (cdr exp)))
(define (push-sublist sub)
(iter (cons (shunting-yard sub) output) operators (cdr exp)))
(if (null? exp)
(if (null? operators)
(car output)
(pop-operator))
(let ((token (car exp)))
(cond ((list? token) (push-sublist token))
((op? token)
(if (or (null? operators)
(>precedence? token (car operators)))
(push-operator token)
(pop-operator)))
(else (push-operand token))))))
(iter '() '() exp))

(define p1 (shunting-yard '(x + 3 * (x + y + 2))))
(define p2 (shunting-yard '(x * 5 + 3 * (x + y + 2))))
(define p3 (shunting-yard (make-sum p1 p2)))

(display (deriv p1 'x))
(newline)
(display (deriv p2 'x))
(newline)
(display (deriv p3 'x))


A more complete implementation of this algorithm can be found on the Scheme SICP Solutions wiki.

## Exercise 2.59

(define (union-set set1 set2)
(if (null? set1)
set2
(union-set (cdr set1) (adjoin-set (car set1) set2))))


## Exercise 2.60

The element-of-set? predicate remains unchanged. $$O(n)$$:

(define (element-of-set? x set)
(cond ((null? set) false)
((equal? x (car set)) true)
(else (element-of-set? x (cdr set)))))


Now that there is no need to check if the element to be adjoined is already in the set, adjoin-set is $$O(1)$$:

(define (adjoin-set x set)
(cons x set))


union-set is a straightforward append. $$O(n)$$:

(define (union-set set1 set2)
(append set1 set2))


intersection-set is the trickiest of the lot, in that we now have to handle duplicates. For example, the intersection of '(1 2 3 1) and '(2 4 1 2 1) should be '(1 2 1) - 1 occurs twice in both sets; 2 occurs twice in the second set, but only once in the first set.

For each element in set1, we call element-of-set? and the helper procedure remove, which are $$O(n)$$ operations. $$O(2n^2) \approx O(n^2)$$:

(define (remove x lst)
(cond ((null? lst) '())
((equal? x (car lst)) (cdr lst))
(else (cons (car lst) (remove x (cdr lst))))))

(define (intersection-set set1 set2)
(cond ((or (null? set1) (null? set2)) '())
((element-of-set? (car set1) set2)
(cons (car set1) (intersection-set (cdr set1) (remove (car set1) set2))))
(else (intersection-set (cdr set1) set2))))


## Exercise 2.61

Like element-of-set?, we should expect adjoin-set to examine about half of the elements in the set on average.

(define (adjoin-set x set)
(cond ((null? set) (list x))
((= x (car set)) set)
((< x (car set)) (cons x set))
(else (cons (car set) (adjoin-set x (cdr set))))))


## Exercise 2.62

$$O(n)$$ implementation:

(define (union-set set1 set2)
(cond ((null? set1) set2)
((null? set2) set1)
(else (let ((x1 (car set1)) (x2 (car set2)))
(cond ((= x1 x2)
(cons x1 (union-set (cdr set1) (cdr set2))))
((> x1 x2)
(cons x2 (union-set set1 (cdr set2))))
((< x1 x2)
(cons x1 (union-set (cdr set1) set2))))))))


## Exercise 2.63

Part (a): both procedures produce the same result - an in-order traversal of a tree.

; tree->list-1 abbreviated to tl
(tree->list-1 '(7 (3 () ()) (9 () ())))
(tl '(7 (3 () ()) (9 () ())))
(append (tl '(3 () ())) (cons 7 (tl '(9 () ()))))
(append (append '() (cons 3 '())) (cons 7 (append '() (cons 9 '()))))
'(3 7 9)

; tree->list-2 abbreviated to tl
; Internal procedure copy-to-list abbreviated to ctl
(tree->list-2 '(7 (3 () ()) (9 () ())))
(tl '(7 (3 () ()) (9 () ())))
(ctl '(3 () ()) (cons 7 (ctl '(9 () ()) '())))
(ctl '() (cons 3 (ctl '() (cons 7 (ctl '(9 () ()) '())))))
(ctl '() (cons 3 (ctl '() (cons 7 (ctl '() (cons 9 (ctl '() '())))))))
(cons 3 (cons 7 (cons 9 '())))
'(3 7 9)


Part (b): both procedures have the same order of growth $$O(n)$$, as every node in a tree is visited once.

## Exercise 2.64

Part (a): the list is divided into three parts: the center element this-entry, all elements smaller than this-entry, and all elements larger than this-entry. A tree is created with this-entry as its root node, a left-tree, which is generated by passing all the smaller elements into a recursive call to partial-tree, and a right-tree, which is generated by passing all the larger elements into a recursive call to partial-tree.

   5
/ \
/   \
1     9
\   / \
3 7   11


Part (b): order of growth is $$O(n)$$, as each element in the list is visited once and only one operation cons is performed for each visit.

## Exercise 2.65

Note: this is my very lazy approach to solving this problem. There are better solutions out there that avoid converting between representations of sets.

The previous implementation of union-set is now an internal procedure union-set-int. The arguments to this internal procedure are first converted from trees to lists, and its result is then converted from a list back to a tree:

(define (union-set set1 set2)
(define (union-set-int set1 set2)
(cond ((null? set1) set2)
((null? set2) set1)
(else (let ((x1 (car set1)) (x2 (car set2)))
(cond ((= x1 x2)
(cons x1 (union-set-int (cdr set1) (cdr set2))))
((> x1 x2)
(cons x2 (union-set-int set1 (cdr set2))))
((< x1 x2)
(cons x1 (union-set-int (cdr set1) set2))))))))
(list->tree (union-set-int (tree->list-2 set1)
(tree->list-2 set2))))


Ditto for intersection-set:

(define (intersection-set set1 set2)
(define (intersection-set-int set1 set2)
(if (or (null? set1) (null? set2))
'()
(let ((x1 (car set1)) (x2 (car set2)))
(cond ((= x1 x2)
(cons x1 (intersection-set-int (cdr set1)
(cdr set2))))
((< x1 x2)
(intersection-set-int (cdr set1) set2))
((< x2 x1)
(intersection-set-int set1 (cdr set2)))))))
(list->tree (intersection-set-int (tree->list-2 set1)
(tree->list-2 set2))))


## Exercise 2.66

Implementation of lookup where set-of-records is structured as a binary tree, ordered by the numerical values of the keys:

(define (lookup given-key set-of-records)
(let ((entry-key (key (entry set-of-records))))
(cond ((null? set-of-records) false)
((= given-key entry-key)
(entry set-of-records))
((< given-key entry-key)
(lookup given-key (left-branch set-of-records)))
((> given-key entry-key)
(lookup given-key (right-branch set-of-records))))))


## Exercise 2.67

(define sample-tree
(make-code-tree (make-leaf 'A 4)
(make-code-tree
(make-leaf 'B 2)
(make-code-tree
(make-leaf 'D 1)
(make-leaf 'C 1)))))

(define sample-message '(0 1 1 0 0 1 0 1 0 1 1 1 0))

(decode sample-message sample-tree) ; '(A D A B B C A)


## Exercise 2.68

We are given:

(define (encode message tree)
(if (null? message)
'()
(append (encode-symbol (car message) tree)
(encode (cdr message) tree))))


…and we have to define:

(define (encode-symbol symbol tree)
(if (or (null? symbol) (leaf? tree))
'()
(cond ((memq symbol (symbols (left-branch tree)))
(cons 0 (encode-symbol symbol (left-branch tree))))
((memq symbol (symbols (right-branch tree)))
(cons 1 (encode-symbol symbol (right-branch tree))))
(else (error "bad symbol: ENCODE-SYMBOL" symbol)))))


Testing our procedure:

1. it should signal an error if the symbol isn’t in the tree.
2. encoding the result from exercise 2.67 should yield a result identical to the original sample message.
(encode-symbol 'A sample-tree) ; 0
(encode-symbol 'B sample-tree) ; 10
(encode-symbol 'C sample-tree) ; 111
(encode-symbol 'D sample-tree) ; 110

; Should signal error
; (encode-symbol 'E sample-tree)

; Should be equal
(equal? (encode '(A D A B B C A) sample-tree)
'(0 1 1 0 0 1 0 1 0 1 1 1 0))


## Exercise 2.69

We are given:

(define (generate-huffman-tree pairs)
(successive-merge (make-leaf-set pairs)))


…and we have to define:

(define (successive-merge leaf-set)
(if (= (length leaf-set) 1)
(car leaf-set)
(successive-merge
(adjoin-set (make-code-tree (car leaf-set)
(cadr leaf-set))
(cddr leaf-set)))))


The trick with this procedure (which I completely missed, until I checked my answers here) is to adjoin the result of make-code-tree (i.e. the node created from merging the nodes with the two smallest weights) with the remainder of leaf-set.

The resulting (ordered) set, containing the new node, is then passed into a recursive call to successive-merge:

; leaf abbreviated to lf
(successive-merge '((lf A 3) (lf B 5) (lf D 6) (lf C 6)))
(successive-merge (adjoin-set (make-code-tree '(lf A 3) '(lf B 5)) '((lf D 6) (lf C 6))))
(successive-merge (adjoin-set '((lf A 3) (lf B 5) (A B) 8) '((lf D 6) (lf C 6))))
(successive-merge '((lf D 6) (lf C 6) ((lf A 3) (lf B 5) (A B) 8)))
...
(successive-merge '(((lf A 3) (lf B 5) (A B) 8) ((lf C 6) (lf D 6) (C D) 12)))
...
'(((lf A 3) (lf B 5) (A B) 8) ((lf D 6) (lf C 6) (D C) 12) (A B D C) 20)


## Exercise 2.70

(define rock-tree
(generate-huffman-tree '((A 2) (BOOM 1) (GET 2) (JOB 2)
(SHA 3) (NA 16) (WAH 1) (YIP 9))))

(display (encode '(GET A JOB
SHA NA NA NA NA NA NA NA NA
GET A JOB
SHA NA NA NA NA NA NA NA NA
WAH YIP YIP YIP YIP YIP YIP YIP YIP YIP
SHA BOOM)
rock-tree))

; '(1 1 1 1 1 1 1 0 0 1 1 1 1 0 1 1 1 0 0 0 0 0 0 0 0 0 1 1
;   1 1 1 1 1 0 0 1 1 1 1 0 1 1 1 0 0 0 0 0 0 0 0 0 1 1 0 1
;   0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 1 1 0 1 1 0 1 1)


$$log_2{8} = log_2{2^3} = 3$$ bits per symbol for fixed-length encoding. Smallest number of bits required is $$3 * 36 = 108$$.

## Exercise 2.71

For $$n = 5$$:

((A 1) (B 2) (C 4) (D 8) (E 16))

    . {A B C D E} 31
/ \
/   \
E 16   . {A B C D} 15
/ \
/   \
D 8   . {A B C} 7
/ \
/   \
C 4    . {A B} 3
/ \
/   \
A 1   B 2


Skipping $$n = 10$$ as it’s more of the same.

$$1$$ bit is required to encode the most frequent symbol; $$(n - 1)$$ bits for the least.

## Exercise 2.72

For the most frequent symbol: $$O(1)$$, as the (only) branch that needs to be searched contains only one symbol.

For the least frequent symbol: $$O(n^2)$$ worst-case, as we have to search for the symbol in the sets of symbols in the left and right branches at each level.