# Development log
# November 19th, 2025
## Rhythm
Developing a theory of rhythm for unheard.
There is a natural tension between relative and
absolute positioning of time objects in time.
Let me try to define relative positioning.
Absolute positioning is positioning a time object at an absolute
location on a timeline. For example, an absolutely positioned
time object might span [1 4]. Part of the definition might include
the constraint that the position of absolutely-positioned
time objects is known at compile time.
A relatively positioned time object is positioned relative to some
other entity (a container or another time object, for example).
A time object is relatively positioned if the related entity's
position is not known at compile time.
Maybe a more interesting differentiation is time objects that
are known to exist at compile time vs time objects that are not.
Relative positioning makes the following challenging:
- Seeking, since you have to play through the timeline to get to an event.
- Temporal indexing: it is not obvious how indexing would work with relatively positioned objects.
Absolute positioning, meanwhile, is substantially less flexible. How would loops work?
I think this is probably one of the hardest pieces for me to get right. I'll be doing a lot of writing to help me better understand the problem space. Here are some of my high-level goals:
- A composer should be able to decompose their composition into phrases, just like a programmer can decompose a program into functions. (Hint: phrases are functions!) A composer should be able to work on a phrase in isolation, at times iterating upon it without regard for the entire composition. At other times, it should be easy for the composer to hear a phrase in concert with other phrases.
- It should be intuitive for a composer to write a piece that changes between time signatures.
- It should be intuitive for a composer to express polyrhythms.
- The playback engine should provide affordances for starting loops or phrases at the next downbeat. This interacts with the polyrhythm constraint in various ways. Regardless, I want to minimize surprise.
- Seeking (that is, jumping to a point in a composition) should be instantaneous. This has far-reaching consequences: in particular, it precludes certain kinds of iterative composition, where state t2 = f(state t1).
- Looping should be easy and flexible. Note, though, that the ban on iterative composition imposed by the previous bullet point seems to imply that looping can't exist at all! Fortunately, I have some theories for how to resolve this conflict. They need to be worked through, though.
- It should be possible to slow down the tempo of a song to (nearly) arbitrarily small values. Same for speeding up. This would allow for some kind of rhythmic fractals, where tempo slows down forever (while new subdivisions of tempo appear continuously). I've been looking into how "infinite zoom" fractal renderers represent zoom steps numerically, and hope to lift some of that into the playback engine.
- Related to the above, I want to provide some kind of mechanism for expressing recursive rhythms – that is, a rhythm that plays concurrently at tempo t, 2t, 1/2t, 4t, 1/4t, etc. (A composer would specify the number of iterations they want to play above and below t.) This mechanism would account for tempo zooming automatically.
As the theory of rhythm develops, I'll find that some of the above goals are fundamentally incompatible, so I'll have to make choices. That will be part of the fun.
## Key questions
- The monotonic clock is ticking. That's cool. But what is the
relationship between monotonic clock time and a note's actual input?
That is, how does a clock (wall or pulse) query a note's presence?
Especially when they are composed?
## Things I'm pretty sure about
- Base unit should be a beat, and durations should be expressed
as fractional beats (e.g. 1/4, 2/1).
- Take a look at my "static" topology diagram, and consider that
flows can bind to positions. Given that, maybe it is possible
to have dynamic topologies. Put differently: As long as every
dynamic topology input flow is continuously defined, then going
"back in time" can mean "reversing the current state". This
_DOES NOT_ mean reversing time, or even guaranteeing that what
you play backward is the same as what you play forward.
What might happen when you hit play under this model?
- You pass a flow into a player.
- The player sets time to init-time.
- The pulse clock starts pulsing.
## Random notes
What is the "smallest time unit" that we want to represent?
We have to specify the number of "smallest time units" in a pulse.
I _think_ this is tightly tied to recursive zooming in. We can
specify durations as [128th notes, zoom level]
We increment zoom level when zooming in
We can set a "max frequency" for control information (say 100hz).
Then, our time-recursive function can short-circuit any recursive objects
whose children have a minimum frequency that falls below the max frequency.
This will require that time objects emit min-frequency metadata,
which can be derived by finding the longest child.
This is a wild theory - test this tomorrow.
Actually, a better solution would be to just put the composer in control.
The composer should specify that they want n doublings or halvings of
a given phrase played concurrently.
## Solving for:
## Time signature changes
How does this interact with looping?
## Concurrent phrases with differing time signatures
It should be possible for concurrent time signatures to emit
their:
- Offset
- Numerator
They can then be merged together to create a data structure
containing lazy seqs of "beats" at each combination
Actually, just provide a function that takes offsets and numerators
and the "combination" that you're looking for, and returns
a lazy seq of beating indices
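Sketching that function (integer beat positions and the `{:offset o :numerator n}` meter shape are simplifying assumptions):
```clojure
;; A meter's downbeats fall at offset, offset + numerator, offset + 2*numerator, ...
(defn downbeat?
  [{:keys [offset numerator]} beat]
  (zero? (mod (- beat offset) numerator)))

;; Lazy seq of beat indices where every given meter has a downbeat.
(defn co-downbeats
  [meters]
  (filter (fn [beat] (every? #(downbeat? % beat) meters))
          (range)))

;; 3-against-4 starting together: co-downbeats every 12 beats.
;; (take 3 (co-downbeats [{:offset 0 :numerator 3}
;;                        {:offset 0 :numerator 4}]))
;; => (0 12 24)
```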
## Tick frequency
Monotonic clock frequency is tied to doubling or halving of tempo,
along with "discrete nyquist"
### Looping
What would it mean for each "phrase" to have its own derived timeline?
(Maybe not literally a phrase, maybe some container designed for this).
and what would it mean if these phrases could have timeline offsets
that repeat?
### Jump back
Similar to a repeat in music theory. A timeline GOTO.
### Static repeat
Not actually a loop. This is calling a phrase again and again,
adding the phrase's length to the offset at each repetition.
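A rough sketch of the offset arithmetic, assuming (hypothetically) that a phrase is invoked with an offset and has a known length in beats:
```clojure
;; Play phrase-f n times back to back by shifting each call's
;; offset by the phrase's length.
(defn static-repeat
  [phrase-f phrase-len n]
  (map (fn [i] (phrase-f (* i phrase-len)))
       (range n)))
```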
## Phrase isolation
It should be easy to play a phrase by itself. In fact, it should be possible to
loop a phrase so you can just hear it while you're working on it.
Should create repl.play and repl.repeat:
```
(play phrase)
(play phrase 130)
(repeat phrase)
(repeat phrase 130)
(stop)
```
### Start a phrase at the next downbeat
### Start a phrase at the next polyrhythmic co-downbeat
### Abstract phrases away as functions
### Make seeking both easy to reason about and performant
### Sensible reverse playback (rolling a tempo from pos to neg)
### Express fractional notes sensibly
# November 20th, 2025
## Summary of yesterday's "theory of rhythm" noodling:
Must support:
- Playing a phrase in isolation
- Looping
- Seeking
- Time signatures and time signature changes
- Coming in at the next downbeat
Don't box out:
- Polyrhythms
- Unlimited slowing and speeding
Looping means a few things:
- Jumping back, identical to a repeat in music notation. Timeline goto.
- Static repeat. That is, play phrase x n times. Note that this is
  repeating, not looping.
Insights:
- Don't support iterative composition, where state t2 = f(state t1).
Iterative composition makes seek time linear with composition
length.
- The monotonic clock frequency is closely related to the maximum
number of events per second that we want to emit.
- Theoretically, infinite zoom could automatically derive the number
of doublings and halvings to play by looking at the monotonic
clock frequency and the longest event in a phrase. Doublings
that would result in no event changes due to all events falling
below the monotonic clock sampling rate could stop upward
recursion. This seems complicated to implement in practice,
but would technically work. (Think Nyquist.)
- Concurrent time signatures could emit their start offset and
numerator. We could provide a function that takes offsets and
numerators and returns a lazy seq of polyrhythmic downbeat
positions.
Open questions:
Should the base unit be a beat, with durations expressed
as fractional units of a beat (e.g. 1/4, 4/1)? Or should the base unit
account for the time signature denominator? My hunch is that the internal
representation should be fractional beats, with helper functions that
convert from the current time signature to fractional beats.
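A sketch of one such helper, leaning on Clojure ratios for exactness (the argument shapes are assumptions):
```clojure
;; note-value is a fraction of a whole note (quarter note = 1/4);
;; the beat unit is 1/denominator of a whole note, so
;; beats = note-value * denominator.
(defn note->beats
  [ts-denominator note-value]
  (* ts-denominator note-value))

;; In 6/8, a quarter note lasts two (eighth-note) beats:
;; (note->beats 8 1/4) => 2
```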
## More brainstorming
Let me try to describe how the various decisions suggested above
could compose. I'll start at the end, and work backward.
You have a musical phrase representing your composition
or a part of it, and you want to play that phrase. The phrase
is a `timeline` object. The playback engine will query the
timeline object.
First, you connect it to the playback engine:
```
(on-deck phrase input-map-f output-map-f)
```
`input-map-f` is a function that accepts the playback engine's available input
subsystems (initially just `{:midi midi-message-flow}`, but potentially
containing things like DMX and OSC), and returns a flow of a map that wires up
the composition's input arguments.
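For instance, an `input-map-f` might look something like this (`midi->tonic` is a hypothetical mapping function; `m` is missionary.core):
```clojure
;; assumes (require '[missionary.core :as m])
(defn input-map-f
  [{:keys [midi]}]
  ;; Wire the engine's midi message flow to the composition's :tonic
  ;; argument; m/latest re-derives the map on each message.
  (m/latest (fn [msg] {:tonic (midi->tonic msg)}) midi))
```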
Now, play the song:
```
(play 120)
```
`play` starts the clock, querying the timeline object as needed.
```
(pause)
```
`pause` pauses the piece.
```
(resume)
```
`resume` resumes the piece.
```
(play)
```
`play` with no arguments starts the piece from the beginning.
```
(loop)
```
`loop` with no args automatically resets the playback
position to the beginning at the end of the piece.
```
(loop start end)
```
This form of `loop` loops a segment of the composition.
(Aside: A pulse clock emits integers at the pulse rate. The
emitted integer represents the current _zoom level_.)
Open question: The playback engine queries the timeline at clock
positions. How does this relate to zoom levels? Is zoom level
part of the query?
### Random thoughts
Right now, phrases are eagerly composed together. This means that
if you redefine part of a piece, you have to redefine all of its
dependents, too. This might make interactive development annoying.
# November 21st, 2025
## Key insights
There is no need for a minimum rhythmic division! Just use fractions of a beat,
all the way down to the time-object / interval tree. It seems obvious in
hindsight.
## New ideas
Phrases can be named (at invocation time, not definition time). This will allow you to quickly jump to a phrase.
Then, in the UI, we can tell you where you are and where you can jump to.
Note that since phrases can be nested, phrase names are concatenated into a
vector of [outer inner], arbitrarily many deep.
Looping can be enabled / disabled for a named phrase.
When the playhead rolls into an enabled loop, it will play until the end of the
phrase, at which point it will jump the timeline back to the start of the looped
phrase. This works with nested phrases / loops. If nested loops are enabled, the
innermost loop takes precedence.
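A sketch of the innermost-wins rule over name paths (representing enabled loops as a set of paths is an assumption):
```clojure
;; Given the playhead's current phrase path, pick the deepest enabled
;; loop whose path is a prefix of it.
(defn active-loop
  [enabled-loops path]
  (->> enabled-loops
       (filter #(= % (vec (take (count %) path))))
       (sort-by count >)
       first))

;; (active-loop #{[:verse] [:verse :piano-1]} [:verse :piano-1 :bar-3])
;; => [:verse :piano-1]
```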
## TODO:
- Document the bug that could occur if two time object flow invocations shared an identity.
- Write down the idea about the difference between a tree and a DAG.
- Write down the static topology decision / summarize the theory of time.
# November 22nd, 2025
To think about:
- What about wanting to change note durations live? Coupling notes directly to time objects
seems like it might be too strict
- What if phrases / groups were also time objects? I feel like I'm fundamentally questioning everything :/
- Maybe create "static-note" which is a time object, and "dynamic note" which is not?
Maybe this panic is not a big deal. I think I should go ahead and continue with "note as fixed duration". After all, a note doesn't _have to_ play during the entire duration. It's more like it _can_. Swing, variability, etc. can all still be accomplished via other means.
# November 25th, 2025
Okay, time to write down some confusion.
What do all the time objects eventually get merged into?
Something with the following qualities:
- It is a data structure representing various output effect providers, e.g. MIDI
- Each output effect provider has its own... merging semantics?
- Time objects are created and destroyed?
Is it possible / does it make sense to differentiate over the
stream of all events emitted from a phrase? One dimension to differentiate
over would be the flow associated with a particular time object. I already
know how to do this - I could modify the reconcile-merge function to accomplish this.
I think there is probably another dimension that I'm not thinking of. In my
original group operator, I emitted a value for content in the "enabled" state,
and I emitted the empty set in the "does not exist" state. My original `poly`
performed a set union of all active notes. This then would have been differentiable;
I could have used the differentiate function from my new missionary.util.
What is less clear to me is how to achieve the same behavior with time objects.
I think that I could modify note to return e.g. the empty set in my note
function. But how would I then group them? Would I want to couple the
grouping context with note?
OH! Maybe we want to group-by e.g. :note? That is, each time object-producing
function would be able to direct its contents to a differentiable grouping operator
for that particular type?
Remember that this was the original definition of poly and group:
```clojure
(defn poly
  [& notes]
  (m/signal (m/cp (apply union (m/?< (apply m/latest vector notes))))))

;; TODO: group could actually wrap note, rather than being used explicitly.
;; Will introduce a lot of GC churn, though.
(defn group
  [clock start end content]
  (m/cp (let [content (m/signal content)]
          (if (m/?< (m/latest #(<= start % end) clock))
            (m/?< content)
            (m/amb #{})))))
```
Oh, here's an idea. What if we merged the union semantics from poly with the
lifecycle semantics of reconcile-merge? That is, rather than emitting from
each time object's flow, we instead unioned the latest elements from each time object?
Can I do that?
Why did poly originally work?
Each note emitted either the empty set or a set containing its value.
Multiple notes' groups were merged emit-wise:
#{1} #{} #{} -> #{1}
#{1} #{2} #{} -> #{1 2}
#{} #{2} #{3} -> #{2 3}
That is, _each note's state_ was sampled on each emit. This is due to the
behavior of m/latest, which is fundamentally not a differentiable operator.
The behavior of reconcile-merge is more like:
#{1} -> #{1}
#{} -> #{} ;; Failure! We forgot about the presence of #{1}
Could we turn that into:
{} -> #{}
{1 true} -> #{1}
{2 true} -> #{1 2}
{1 false} -> #{2}
{3 true} -> #{2 3}
Yes! We could! And in fact this is what the differentiate function does.
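In plain Clojure, the fold being described is something like this (just a model of the semantics, writing each lifecycle event as an [id alive?] pair; the real differentiate lives in missionary.util):
```clojure
;; Fold [id alive?] lifecycle events back into the set of live ids.
(defn apply-lifecycle
  [active [id alive?]]
  (if alive? (conj active id) (disj active id)))

;; (reductions apply-lifecycle #{} [[1 true] [2 true] [1 false] [3 true]])
;; => (#{} #{1} #{1 2} #{2} #{2 3})
```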
Now, how do we get {1 true} and {1 false}? That is, where do we define
that 1 has been created and then destroyed?
One of the things that m/latest relies on to work is the notion of "being
able to sample everything at once." That is, "sampling all notes at once."
The set-events function is allowing us to group lifecycle events by time-object
flow.
We could use time-object lifecycle information to emit {id true} at the start
of a time object's lifecycle and {id false} at the end, but this would require
that we have a notion of identity for each time object. The question is, do
we always? Imagine (note ... (m/ap (m/?< clock))), e.g. the value of a note
is dynamic. Then, what is its identity?
Ah! So this is one of the core differences between emit-wise grouping using
m/latest, and differentiated lifecycles. The former allows for _anonymous_
object identities, while in the latter, objects really do need IDs.
But do the IDs need to be visible? I think maybe not. (might be wrong, though.)
It might be possible to create identities for them within the body of
reconcile-merge. (Generate one, then emit :up, then include it on each emit, and then :down.)
I like this idea.
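As a non-reactive model of that shape (`values` stands in for one time object's emissions; everything here is hypothetical):
```clojure
;; Wrap one time object's emissions in :up / :down lifecycle events,
;; generating an id so anonymous objects still get an identity.
(defn lifecycle-events
  [values]
  (let [id (gensym "tobj")]
    (concat [[:up id]]
            (map (fn [v] [:emit id v]) values)
            [[:down id]])))

;; (lifecycle-events [60 62])
;; => ([:up tobj1234] [:emit tobj1234 60] [:emit tobj1234 62] [:down tobj1234])
```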
# November 28th, 2025
Very happy with Wednesday's progress. Time to nail down a few more things.
1. How should I match phrases up to input sources?
2. How should I account for there eventually being multiple input media
types, like OSC, DMX, etc., all with their own semantics?
3. How should input sources flow through phrase composition?
Currently, `note` and `phrase` are called differently in a way that doesn't
make much sense to me. I need to play with that syntax a bit, too.
There's another interesting question that is really sticky. Imagine for
a moment that I eventually want to take musical-like inputs from multiple
sources, e.g. midi and OSC. I want to perform a similar mapping on the
output side. Imagine I want to map from MIDI in to OSC out. This suggests
that using MIDI as my internal representation for harmonic information
probably isn't the best idea, even if it is the easiest. Another example
is non-standard tunings, or music where one wants frequency-level control.
Okay, so what would the internal representation of harmonic information
be? I want something fairly flexible but not crazy. Probably just
[frequency amplitude] to begin with.
Anyway. I need to start thinking about these things, which are all related.
Let's do some experimenting.
Here is a current example:
```clojure
(defn triad
[>c >tonic]
(phrase
;; This is a major chord,
;; held 32 32nd notes.
;; The tonic can vary.
(note >c 0 0 32 >tonic)
(note >c 0 0 32 (m/latest #(+ % 4) >tonic))
(note >c 0 0 32 (m/latest #(+ % 7) >tonic))))
(defn drums
[>clock]
(phrase (note >clock 1 1 1 (m/ap kick))
(note >clock 1 9 1 (m/ap kick))
(note >clock 1 17 1 (m/ap kick))
(note >clock 1 25 1 (m/ap kick))
(note >clock 1 1 1 (m/ap hat))
(note >clock 1 5 1 (m/ap hat))
(note >clock 1 9 1 (m/ap hat))
(note >clock 1 13 1 (m/ap hat))
(note >clock 1 17 1 (m/ap hat))
(note >clock 1 21 1 (m/ap hat))
(note >clock 1 25 1 (m/ap hat))
(note >clock 1 29 1 (m/ap hat))
(note >clock 1 5 1 (m/ap snare))
(note >clock 1 13 1 (m/ap snare))
(note >clock 1 21 1 (m/ap snare))
(note >clock 1 29 1 (m/ap snare))))
(defn song'
[>clock >tonic]
(phrase ((triad >clock >tonic) 0)
((triad >clock (m/latest #(+ % 12) >tonic)) 0)
((drums >clock) 0)))
```
Things that stand out:
1. It is interesting that `phrase` doesn't take a flow.
2. I don't like that `note` takes a midi :ch argument. Too
coupled with a particular output medium. (But I also don't want
to get _too far_ from the specific output medium and risk creating
surprising behavior.)
3. These examples aren't great because we don't actually do much that is
   interesting with argument passing.
4. The invocation of triad and drums in song is gross.
5. It is unclear how happens-before relationships would be structured.
6. We pass clock in explicitly, but it isn't actually clear why.
7. Should we encourage phrases to use argument names that are coupled
to input types, or to give them names that are meaningful to the
music itself?
8. How do I want to specify outputs?
Here is a first pass at rewriting:
```clojure
(defn triad
[>tonic]
(phrase
;; This is a major chord,
;; held 32 32nd notes.
;; The tonic can vary.
(note 0 0 32 >tonic)
(note 0 0 32 (m/latest #(+ % 4) >tonic))
(note 0 0 32 (m/latest #(+ % 7) >tonic))))
(defn drums
[]
(phrase (note 1 1 1 (m/ap kick))
(note 1 9 1 (m/ap kick))
(note 1 17 1 (m/ap kick))
(note 1 25 1 (m/ap kick))
(note 1 1 1 (m/ap hat))
(note 1 5 1 (m/ap hat))
(note 1 9 1 (m/ap hat))
(note 1 13 1 (m/ap hat))
(note 1 17 1 (m/ap hat))
(note 1 21 1 (m/ap hat))
(note 1 25 1 (m/ap hat))
(note 1 29 1 (m/ap hat))
(note 1 5 1 (m/ap snare))
(note 1 13 1 (m/ap snare))
(note 1 21 1 (m/ap snare))
(note 1 29 1 (m/ap snare))))
(defn song'
[>tonic]
(phrase ((triad >tonic) 0)
((triad (m/latest #(+ % 12) >tonic)) 0)
((drums) 0)))
```
Looks cleaner without clock. This doesn't mean that
clock can't be passed explicitly, just that it isn't necessary
by default.
What about specifying outputs? Already there is an interesting
decision to make. On drums, would one want to treat each drum as
its own output, or treat them as simply different parts of the same
instrument?
What is an output? Is it an instrument? I think the only honest answer
is that this question can't be answered in general. Sometimes it will
make sense to think of a cow bell as an instrument, and sometimes as a
component of a percussion kit.
The point of an output, I think, is to somehow group output information
together. I think it is fair to say that everything on the same output
will share some semantics. From a programming perspective, the state
of an output should be uniform: a single data structure, and a single
merge operation.
The data structure and merge operation shouldn't be re-written for each
instrument. There will only be a few of these. Keyed instruments like
pianos will be able to share a single data structure and merge operation,
even if a harpsichord won't make use of velocity information.
When writing a phrase, one specifies the data structure and merge
operation. This actually isn't a property of the phrase, though, it
is a property of the particular part of the phrase that is being
described. Remember that a phrase may have many instruments in it,
and therefore many data structures and merge ops.
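To make that concrete, here's a minimal sketch of one output context under those constraints (the map shape and the nil-velocity-means-note-off convention are assumptions):
```clojure
;; One output context: a uniform state data structure plus a single
;; merge operation shared by everything routed to this output.
(def keyboard
  {:init  {} ;; active notes: pitch -> velocity
   :merge (fn [state notes]
            ;; nil velocity means note-off; drop it from the state.
            (reduce-kv (fn [s pitch vel]
                         (if vel (assoc s pitch vel) (dissoc s pitch)))
                       state
                       notes))})
```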
This makes me wonder: beyond phrase and time object, is there another
concept that I should introduce? Something like "instrument"?
```clojure
(defn triad
[>tonic]
(phrase
(instrument keyboard
(note 0 32 >tonic)
(note 0 32 (m/latest #(+ % 4) >tonic))
(note 0 32 (m/latest #(+ % 7) >tonic)))))
(defn drums
[]
(phrase
(instrument percussion
(note 1 1 (m/ap kick))
(note 9 1 (m/ap kick))
(note 17 1 (m/ap kick))
(note 25 1 (m/ap kick))
(note 1 1 (m/ap hat))
(note 5 1 (m/ap hat))
(note 9 1 (m/ap hat))
(note 13 1 (m/ap hat))
(note 17 1 (m/ap hat))
(note 21 1 (m/ap hat))
(note 25 1 (m/ap hat))
(note 29 1 (m/ap hat))
(note 5 1 (m/ap snare))
(note 13 1 (m/ap snare))
(note 21 1 (m/ap snare))
(note 29 1 (m/ap snare)))))
```
Oh, interesting. `triad` isn't just a keyboard concept!
Instrument feels like a pretty specific name. I don't think I love that.
But it is bringing up an interesting thought. Phrases compose together,
but it seems like things like `notes` _do_ have some kind of wrapper
context that separates them from a phrase. We can actually see this distinction
repeated in the programming context, with the presence of `lift`.
Maybe `lift` and `instrument` are the same thing?
How would the generic concept of a triad be expressed outside
of the instrument context? Here's what I mean. I want to support the
creation of the music theory concept of a triad and import it into any
instrument such that it can be used in any phrase.
Oh, maybe the problem is that the triad helper function should actually just
be something that returns three flows, one for each note in the interval.
Or even a flow-returning function that takes a root and a degree?
e.g.
```clojure
(defn piano
  [>root]
  (let [triad (theory/triad :major >root)]
    (phrase
      (instrument keyboard :keyboard ;; keyboard is the name of a data structure / merge op;
        (note 1 32 (triad 1))        ;; :keyboard is declaring an abstract output destination
        (note 1 32 (triad 3))
        (note 1 32 (triad 5))))))
(defn drums
[]
(phrase
(instrument percussion :drums
(note 1 (m/ap kick))
(note 9 1 (m/ap kick))
(note 17 1 (m/ap kick))
(note 25 1 (m/ap kick))
(note 1 1 (m/ap hat))
(note 5 1 (m/ap hat))
(note 9 1 (m/ap hat))
(note 13 1 (m/ap hat))
(note 17 1 (m/ap hat))
(note 21 1 (m/ap hat))
(note 25 1 (m/ap hat))
(note 29 1 (m/ap hat))
(note 5 1 (m/ap snare))
(note 13 1 (m/ap snare))
(note 21 1 (m/ap snare))
(note 29 1 (m/ap snare)))))
(defn song'
[>root]
(phrase ((piano >root) 0)
((piano (m/latest #(+ % 12) >root)) 0)
((drums) 0)))
```
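For reference, a sketch of what a flow-returning `theory/triad` could look like (the semitone table and the degree-to-flow shape are my guesses, not settled design):
```clojure
;; Scale degree -> semitone offset above the root, per triad quality.
(def triad-offsets
  {:major {1 0, 3 4, 5 7}
   :minor {1 0, 3 3, 5 7}})

;; Returns a function from degree to a flow of pitches that tracks
;; the (possibly varying) root flow.
(defn triad
  [quality >root]
  (fn [degree]
    (m/latest #(+ % (get-in triad-offsets [quality degree])) >root)))
```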
Question: How would I handle the fact that triad might represent pitches
as note numbers, while an output method might represent them as frequencies?
I think I would want some kind of `note` protocol that can convert between
different representations.
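A protocol along these lines might work (the protocol is hypothetical; the MIDI-to-Hz formula is the standard 440 * 2^((n-69)/12)):
```clojure
;; Generalize over concrete pitch representations.
(defprotocol Pitched
  (->midi [this] "This pitch as a MIDI note number.")
  (->freq [this] "This pitch in Hz."))

;; Treat plain longs as MIDI note numbers.
(extend-protocol Pitched
  Long
  (->midi [n] n)
  (->freq [n] (* 440.0 (Math/pow 2.0 (/ (- n 69) 12.0)))))

;; (->freq 69) => 440.0
```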
I think it does make sense that, unlike phrases, outputs / instruments
can't be nested.
Can a phrase have multiple instruments / outputs? I think the
answer should probably be "yes". That would mean abstractions could
return instruments rather than phrases. This would be nice because
it would remove a layer of unnecessary naming in the naming tree. Example:
```clojure
(defn piano
  [>root]
  (let [triad (theory/triad :major >root)]
    (instrument keyboard :keyboard ;; keyboard is the name of a data structure / merge op;
      (note 1 32 (triad 1))        ;; :keyboard is declaring an abstract output destination
      (note 1 32 (triad 3))
      (note 1 32 (triad 5)))))
(defn drums
[]
(instrument percussion :drums
(note 1 1 (m/ap kick))
(note 9 1 (m/ap kick))
(note 17 1 (m/ap kick))
(note 25 1 (m/ap kick))
(note 1 1 (m/ap hat))
(note 5 1 (m/ap hat))
(note 9 1 (m/ap hat))
(note 13 1 (m/ap hat))
(note 17 1 (m/ap hat))
(note 21 1 (m/ap hat))
(note 25 1 (m/ap hat))
(note 29 1 (m/ap hat))
(note 5 1 (m/ap snare))
(note 13 1 (m/ap snare))
(note 21 1 (m/ap snare))
(note 29 1 (m/ap snare))))
(defn song'
[>root]
(phrase :first-verse
((piano >root) 0)
((piano (m/latest #(+ % 12) >root)) 0)
((drums) 0)))
```
Okay, important to note here:
1. I think the name associated with the call to `instrument` is actually
setting the name on the implicit `phrase` that the call to instrument is
returning.
2. A goal that I am trying to achieve here is implicit, unambiguous naming. Here,
I've accomplished that for drums ([:first-verse :drums]) but not for piano.
Let's try again?
```clojure
(defn piano
  [>root]
  (let [triad (theory/triad :major >root)]
    (instrument keyboard
      (note 1 32 (triad 1))
      (note 1 32 (triad 3))
      (note 1 32 (triad 5)))))
(defn drums
[]
(instrument percussion
(note 1 1 (m/ap kick))
(note 9 1 (m/ap kick))
(note 17 1 (m/ap kick))
(note 25 1 (m/ap kick))
(note 1 1 (m/ap hat))
(note 5 1 (m/ap hat))
(note 9 1 (m/ap hat))
(note 13 1 (m/ap hat))
(note 17 1 (m/ap hat))
(note 21 1 (m/ap hat))
(note 25 1 (m/ap hat))
(note 29 1 (m/ap hat))
(note 5 1 (m/ap snare))
(note 13 1 (m/ap snare))
(note 21 1 (m/ap snare))
(note 29 1 (m/ap snare))))
(defn song'
[>root]
(phrase
((piano >root) :piano-1 0)
((piano (m/latest #(+ % 12) >root)) :piano-2 0)
((drums) :drums 0)))
```
Here, `phrase` is returning an invokable object that receives a name. If I were
to repeat `song'` twice, we would get:
```clojure
(defn song
[>root]
(phrase
((song' >root) :first-verse 0)
((song' >root) :second-verse 32)))
```
Now we have the following unambiguous output addresses:
{:first-verse [:piano-1 :piano-2 :drums]
:second-verse [:piano-1 :piano-2 :drums]}
Here's an aesthetic goal - make phrase invocation look more like:
```clojure
(defn song'
[>root]
(phrase
(piano >root :piano-1 0)
(piano (m/latest #(+ % 12) >root) :piano-2 0)
(drums :drums 0)))
(defn song
[>root]
(phrase
(song' >root :first-verse 0)
(song' >root :second-verse 32)))
```
I'm still not sure that I understand what a `note` is. What are other
objects like it? You could make something like a `pulse`, representing a
sine wave controlling an LFO. I guess from a programming perspective,
a note is an instance of some kind of object that becomes a signal
associated with output state and an output 'context' that explains how
similar objects get merged together. That output context receives a
name, and the name is guaranteed to be unique through multiple
nestings.
Here's a question. Should phrases be responsible for renaming their
children? This could help control name explosions, where a long composition
could have many many names to map before playback.
Yeah, that seems worth exploring. You could have some kind of rename map:
```clojure
(phrase
  {[:first-verse :piano-1]  :piano-1
   [:second-verse :piano-1] :piano-1
   [:first-verse :piano-2]  :piano-2
   [:second-verse :piano-2] :piano-2
   [:first-verse :drums]    :drums
   [:second-verse :drums]   :drums})
```
This would mean that you could do
```clojure
(play {:piano-1 midi-out-1 :piano-2 midi-out-2 :drums midi-out-3} song)
```
Big question: Why is there this difference between the way
I represent inputs (arguments) and outputs?
Questions from today that remain open:
1. What does this look like with a more complex example?
2. What do happens-before relationships look like? (Marco?)
3. Why do inputs look familiar (arguments) while outputs look strange (keywords)?
4. Do we need this instrument / output concept? What is it, exactly? What does it do?
5. How would a swung clock fit into all of this? Here again, time seems a little
different. Swung clocks need to interact with timeline selection.
Time objects should probably specify their clock. At playback time,
you define one or more clocks.
6. I don't currently have a good answer to crossfades.
7. Need to brainstorm more on loops / goto / portals and their relationship
with phrases.
Insights:
The point of an output, I think, is to somehow group output information
together. I think it is fair to say that everything on the same output
will share some semantics. From a programming perspective, the state
of an output should be uniform: a single data structure, and a single
merge operation.
The data structure and merge operation shouldn't be re-written for each
instrument. There will only be a few of these. Keyed instruments like
pianos will be able to share a single data structure and merge operation,
even if a harpsichord won't make use of velocity information.
From a programming perspective,
a note is an instance of some kind of object that becomes a signal
associated with output state and an output 'context' that explains how
similar objects get merged together. That output context receives a
name, and the name is guaranteed to be unique through multiple
nestings.
Music theory concepts like triad can be exposed as flow-returning functions.
Outputs / instruments can't be nested, unlike phrases.
A phrase can contain multiple instruments / outputs.
Create a note protocol that generalizes over concrete note representation.
Phrases can simplify the names of their children by providing a rename map.
Concepts to test:
1. Confirm that working on a phrase in isolation and working on a whole piece are
similarly simple
Other random thoughts:
- It seems worth giving time objects explicit
- Every leaf must be a time object
- At this point, time objects should contain other metadata, like their path from enclosing phrases
- Time objects and phrases should have a similar... calling convention? with regards to offset (almost certainly) and duration (maybe)
- Time objects will eventually need a serializable identity for display purposes
- Both time objects and phrases should be reified for display/interactivity purposes
# November 29th, 2025
As an exercise, I'd like to create a function that compiles Strudel
mini-notation down to my IR.
https://strudel.cc/learn/mini-notation/
https://strudel.cc/learn/mondo-notation/
https://strudel.cc/learn/factories/
https://strudel.cc/learn/time-modifiers/
And why this?
https://strudel.cc/learn/stepwise/
Also do a dissection of strudel's alignment system:
https://strudel.cc/technical-manual/alignment/
And this section on voicing:
https://strudel.cc/understand/voicings/
---
What would it mean to reify temporal specifiers, so that they become objects
which can be shortened / lengthened through a series of operations? Keeping in
mind that these little windows encapsulate arbitrary values. They would compose
into a callable which can somehow be invoked with the time object itself, e.g.
the note(s)
# December 1st, 2025
TODO upcoming:
- Inspired by strudel, define a language of musical modifiers
- Read notes of Nov. 28
|