aboutsummaryrefslogtreecommitdiff
path: root/readme.md
blob: da514c7c2d1c43f99771265454ad14c54bddb52c (plain) (blame)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
# BLoCade

[BLoC](https://github.com/marvinborner/bloc) is a file format for shared
binary lambda calculus (BLC).

BLoCade, the BLoC-aid, turns BLoC files back into executable files
(targets). This is useful for [bruijn](https://bruijn.marvinborner.de)
(see [wiki](https://bruijn.marvinborner.de/wiki/coding/compilation/)),
benchmarking, or general term optimization.

## Targets

- BLC (sharing by abstraction): Terms having BLoC entry indices get
  abstracted and applied to the respective term. The indices get
  converted to De Bruijn indices. The dependency graph is resolved by a
  topological sort. Flag `bblc` (bits) and `blc` (ASCII 0/1).
- BLC (unshared): Every BLoC entry gets reinserted into the original
  term. Do not use this if you want efficiency or small files. Flag
  `unbblc` (bits) and `unblc` (ASCII 0/1).
- Planned: “normal” programming languages

## Benchmarks

Some deliberately unoptimized test cases from `test/`, evaluated using
`./run` and measured in bits (using `bloc -m 10 ...`):

| file          | bloc | unbblc/orig | bblc  |
|:--------------|:-----|:------------|:------|
| echo          | 56   | 4           | 4     |
| reverse       | 248  | 172         | 172   |
| churchadd1    | 296  | 390         | 159   |
| churchadd2    | 368  | 433         | 207   |
| churcharith   | 560  | 1897        | 339   |
| uni           | 576  | 459         | 448   |
| ternaryvarmod | 1248 | 2330        | 1824  |
| collatz       | 1904 | 2884        | 1986  |
| ternaryadd    | 2240 | 19422       | 2708  |
| ternaryfac    | 2792 | 9599        | 3091  |
| aoc           | 7480 | 50062       | 10861 |

*Huge* and convoluted programs like
[@woodrush](https:://github.com/woodrush)’s
[lambda-8cc](https://github.com/woodrush/lambda-8cc) will not get
smaller by compiling them to shared BLC. The *many* sharable terms get
abstracted to so many levels that the (unary) BLC indices (in case of
8cc) can literally be 50 kilobit long. One solution would obviously be a
BLC variant with binary indices, or just storing and interpreting such
files in the BLoC format directly (which also uses binary reference
indices). A better algorithm than standard topological sort for
resolving the dependencies would also help.

8cc, measured in megabit:

| file       | bloc | unbblc/orig | bblc    |
|:-----------|:-----|:------------|:--------|
| lambda-8cc | 5.13 | 40.72       | 3267.88 |

## Optimization

You can use BLoC’s minimum tree size deduplication parameter
`--min-size`/`-m` to generate even smaller files. The `optimize` bash
script tries to find the optimal parameter for your target file using a
binary search.

For example, 8cc’s optimal parameter for the bblc target is `-m 4587`,
only marginally reducing the original size:

| file       | bloc | unbblc/orig | bblc  |
|:-----------|:-----|:------------|:------|
| lambda-8cc | 5.8  | 40.72       | 40.65 |