I've been watching a couple of Bret Victor videos recently. In this video, he mentioned how there was a lot of hostility towards assembly when it was first introduced. Assembly allowed machine code to be represented in words, for example writing `add $t1 $t2 $t3` to mean add the values in $t2 and $t3 and place the result in $t1, instead of say `0000 0001 0100 1011 0100 1000 0010 0000`.
Despite improving developer productivity and reducing the likelihood of errors, those who had been programming in binary didn't see much value in assembly. This resistance to new ways of working, to unlearning what you've already learned and thinking in new ways, is as present in 2020 as it was in the 1950s. If anything, it's a more concerning anti-pattern today given how much faster the world is changing.
This thought motivated me to look into category theory. Perhaps understanding the more theoretical aspects could help illuminate why things were implemented the way they were. Perhaps doing so would highlight concepts we take for granted. In last week's post, I mentioned consuming more of Bartosz Milewski's content. I thought I'd complete all three parts before starting Haskell. I'm almost done with Part 1, but have since realized I want a bit more intuition on what applying the theory looks like.
Instead of more Haskell, I actually found Chet Corcos' post on functional JavaScript super helpful. A pure function is a function whose return value is determined only by its input values, without dependencies or side effects. He describes how pure functions "bound the cognitive load of programming", because a pure function forces you to be concerned only with the body of the function.
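As a rough sketch of the idea (the example is mine, not from Corcos' post):

```python
# Impure: the result depends on state outside the function's arguments.
tax_rate = 0.25

def price_with_tax_impure(price):
    return price * (1 + tax_rate)  # changes if tax_rate is mutated elsewhere

# Pure: the result is determined entirely by the inputs, with no side effects.
def price_with_tax(price, rate):
    return price * (1 + rate)

print(price_with_tax(100, 0.25))  # 125.0, for these inputs, every time
```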
A core concept in category theory is composition; the following is from the preface of Milewski's book.
Composition is at the very root of category theory - it’s part of the definition of the category itself. And I will argue strongly that composition is the essence of programming.
In Corcos' post, he emphasizes how writing code through composition of pure functions makes it much easier to trace errors - you'll get a stack trace through every function all the way to the source of the bug. This is in contrast to object-oriented programming, where the state of the object may not be exposed.
What's very cool is seeing how, when mapping two successive functions over a list, applying a single map of the composition of the two functions is faster than applying two separate maps of each function individually! The sections on lazy evaluation (which yields similar performance gains) are fascinating, though perhaps best described in the post itself.
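Here's a rough Python analogue of that comparison (a toy benchmark of my own, not the JavaScript from the post): mapping the composed function makes one pass over the list, while two successive maps build an intermediate list and then walk it again.

```python
import timeit

def double(x):
    return 2 * x

def increment(x):
    return x + 1

xs = list(range(1_000_000))

# Two successive maps: the first builds a full intermediate list,
# which the second then walks all over again.
def two_maps():
    return [increment(y) for y in [double(x) for x in xs]]

# One map of the composed function (increment after double): a single pass,
# no intermediate list.
def one_map():
    return [increment(double(x)) for x in xs]

print("two maps:", timeit.timeit(two_maps, number=10))
print("one map :", timeit.timeit(one_map, number=10))
```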
Like all things in life, there are always trade-offs. Gabriel Gonzalez's blog Haskell for All has an excellent post on how Haskell is better for long-term productivity because it enforces best practices. I first came across the Option type in Rust; perhaps it was inspired by Haskell's Maybe. In any case, this framing forces you to invest time up front dealing with edge cases, as opposed to investing time later on when things break.
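A loose Python sketch of the idea (my own illustration; Python won't enforce this the way Haskell's or Rust's type checkers do): making "no result" part of the signature pushes the edge case onto the caller up front.

```python
from typing import Optional

def parse_port(raw: str) -> Optional[int]:
    """Return the port number, or None if the input isn't a valid port."""
    if raw.isdigit() and 0 < int(raw) < 65536:
        return int(raw)
    return None

# The caller is nudged to handle the missing case right away,
# much like pattern matching on Maybe/Option in Haskell or Rust.
port = parse_port("80a")
if port is None:
    port = 8080  # fall back to a default instead of crashing later
print(port)
```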
The post goes on to talk about the flip side of Haskell. I didn't realize Haskell implements its default String type as a linked list (perhaps a counter-example to those claiming linked lists only come up in interviews). While abstraction is a 'good thing', any good thing in excess is bad - Haskell makes advanced abstraction syntactically cheap, making it easy to get carried away. Amusingly there's even a Hitler parody for that...
Content: Naval Ravikant podcasts
I've collected pieces across a number of content formats, but not enough of any one format for a series. I'll feature them this week.
First is podcasts. I enjoy listening to Naval Ravikant, the co-founder of AngelList but perhaps best known for his How to Get Rich series of tweets. On Samantha Ryan's podcast, his response to the question of how we decide what to do with our limited time (at 25:39) emphasizes the importance of being honest and effective.
It's very important to be honest with yourself so you can actually succeed without hurting yourself, without being in self conflict all the time and without being ineffective. Because if you're effective and you just get what you want, you can trade it for other things.
On Tim Ferriss' podcast, he elaborates on a tweet in the series about gaining financial freedom through ownership of equity. The discussion (at 31:22) echoes Bret Victor above: the hard part of any endeavor is getting over one's resistance to change.
The hard parts are not the learning, it is the unlearning. It’s not the climbing up the mountain. It’s the going back down to the bottom of the mountain and starting over.
It’s the beginner’s mind that every great artist, or every great business person has, which is: you have to be willing to start from scratch. You have to be willing to hit reset and go back to zero. Because you have to realize that what you already know, and what you’re already doing, is actually an impediment to your full potential.
Content: Sam Altman talks
On the theme of inertia, I learned a lot at Square, but I did go through a phase of feeling comfortable. Towards the end of that phase I realized that by not pushing harder I was doing a disservice to the company and, perhaps more importantly, to myself.
I thought a change of environment would be a good wake-up call. I moved from a 3,000+ person company to a 30+ person startup. To help prepare for the transition I listened to a number of Y Combinator talks, in particular the ones with Sam Altman. This talk is from Work at a Startup Expo 2018 (at 16:20).
Every job I've ever had I've been wildly unqualified for, and doing that is the #1 secret to having a really great career. That's the way you have a super fast rate of personal growth, and I think the way careers go is you should put in the most of the effort at the beginning. Because it's this compound interest-like thing, where the work you do now, the learning you do now, the improvement you make early in your career gets to pay off for all the rest.
The talk How to Succeed with a Startup discusses staying positive as a team, stepping up, and a bias towards action (at 8:17).
The spirit of "we'll figure it out" is my favorite thing to hear among early startup team members. A lot of things go wrong, the situations that startups win in tend to be incredibly dynamic, and so this idea that even if I'm not qualified on paper, even if I haven't solved this problem before, even if this problem feels like it's going to kill the company (which many problems will feel that way), the spirit among the team of "we've got the people we need, we're gonna figure this out, we're gonna get this done", that's super important.
Content: Credit card processing flow
The final content format is static visuals. I know, having spent time in payments makes this flowchart more interesting for me personally. It's also curious that it comes up in an article about Chinese payment apps.
There's a weekly book club at RC on Martin Kleppmann's Designing Data-Intensive Applications. Recent discussions reminded me of things I learned at Bradfield; it's been a fun process of rediscovery.
Caching
The first is on how caching affects memory access. Let's start with Moore's law. In the 1960s, everything was slow. Processors got a speedup over the 70s and the 80s; memory access did too, but not to the same degree. By the 90s we had a real gap - it's fast to process data but slow to access it, by a factor of say 100.
How did we bridge this gap? Caching. The idea is that if it's slow to go all the way to RAM for data, let's move the data a bit closer. This was achieved with a piece of dedicated hardware called the cache. Thus we have a 'memory hierarchy' - first registers, then cache, then RAM. Estimates of access times can be found here.
The cache retrieves data a bit like how we get library books. Instead of getting one book at a time, we get a few books in one go. The analogue of the number of books is what's called a cache line. For simplicity, let's say 64 bytes of data are retrieved at a time and stored in the cache.
Thoughts about caching came up when we were discussing database indexing. We can index a database with binary search trees, but this is inefficient because a binary tree node occupies only a fraction of a cache line. This means there'll be more cache misses (when the data we're looking for is not in the cache) relative to cache hits.
B+ trees remedy this by increasing the branching factor, i.e. the number of references to child nodes in each node of the tree. This ensures the whole cache line is used up. In practice the branching factor depends on the amount of space required to store node references and the range boundaries. A toy implementation of a B+ tree in Python can be found here.
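As a back-of-envelope sketch (the sizes below are assumptions for illustration, not from the discussion):

```python
# Assumed sizes, purely for illustration.
page_size_bytes = 4 * 1024   # B+ tree node sized to a 4 KB page
key_size_bytes = 8           # e.g. a 64-bit integer key (range boundary)
child_ref_bytes = 8          # e.g. a 64-bit child reference

# Each entry is roughly one range boundary plus one child reference.
branching_factor = page_size_bytes // (key_size_bytes + child_ref_bytes)
print(branching_factor)  # 256 children per node under these assumptions

# Compare with a binary search tree node: one key plus two pointers is about
# 24 bytes, so most of a 64-byte cache line is fetched but never used.
```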
Encoding
The second is on encoding. Computers store data in binary, i.e. a collection of 1s and 0s, but binary data by itself doesn't tell us what it means. We could think of it as a series of Booleans. We could also 'chunk' the binary data 8 bits at a time and have each chunk represent a character. This encoding is called ASCII, and its table can be found by typing `man ascii` in the Terminal. Each ASCII character takes up one byte, i.e. 8 bits (strictly speaking, ASCII only defines 7 of them).
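A small sketch of the "same bits, different interpretations" point, in Python:

```python
raw = 0b01001000  # one byte: the bit pattern 0100 1000

# Interpreted as an unsigned integer...
print(raw)       # 72

# ...or as an ASCII character (see `man ascii`).
print(chr(raw))  # 'H'

# A sequence of such bytes decodes to text under the ASCII encoding.
print(bytes([72, 105]).decode("ascii"))  # 'Hi'
```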
Suppose we were to send integer values to each other, say the number 1000. We could send it in 'text' format, where each character takes up one byte; this means we'd need 4 bytes for "1000". Alternatively, we could send it in 'binary' format, where 4 bytes can represent anything from -2,147,483,648 to 2,147,483,647 (i.e. the range of a signed 32-bit integer). In other words, sending integers in binary format is a lot more compact.
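A quick way to see this in Python, using the standard struct module (the little-endian signed 32-bit format below is just one choice of wire format; for 1000 itself both versions happen to be 4 bytes, so the example uses a bigger number):

```python
import struct

n = 2_000_000_000

# 'Text' format: one byte per ASCII digit.
as_text = str(n).encode("ascii")
print(len(as_text))     # 10 bytes for ten digits

# 'Binary' format: a fixed-width 4-byte signed integer (little-endian).
as_binary = struct.pack("<i", n)
print(len(as_binary))   # 4 bytes, covering -2**31 .. 2**31 - 1

# The receiver reverses the encoding to recover the integer.
print(struct.unpack("<i", as_binary)[0])  # 2000000000
```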
Thoughts about encoding came up when we were discussing data encoding formats. A commonly-used format is JSON, which is all text. This keeps things simple, but the lack of compactness means it'll be slower to send over the wire. Protocol Buffers, commonly known as protobuf, allows integers to be encoded in binary format and is thus more compact over the wire. This makes a real difference not just for integers but also for binary attachments, like images.
A simple binary encoding is Bencode, which is the encoding used by BitTorrent. Bencode supports strings, integers, lists and dictionaries. Alas it's not that widely used, despite being a "human-readable" binary encoding format. A toy implementation of Bencode can be found here.
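As a flavor of how simple the format is, here's a minimal encoder covering the four types (a toy sketch of my own, not the implementation linked above):

```python
def bencode(value) -> bytes:
    """Encode ints, strings/bytes, lists and dicts in Bencode."""
    if isinstance(value, int):
        return b"i%de" % value                      # e.g. 42 -> b'i42e'
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)       # e.g. b'spam' -> b'4:spam'
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Bencode requires string keys, encoded in sorted order.
        items = sorted((k.encode() if isinstance(k, str) else k, v)
                       for k, v in value.items())
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")

print(bencode({"name": "example", "pieces": [1, 2, 3]}))
# b'd4:name7:example6:piecesli1ei2ei3eee'
```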
Functional programming
It's hard to miss the excitement around Haskell at RC. A few weeks back I told myself I'd like to spend a bit of time on it at my next RC to see what the hype is about. Then I found myself at a stopping point on other projects, and conveniently had Haskell lined up.
As with most things in life, plans only go so far. I came across category theory while reviewing RC discussions on Haskell. I watched the first video in Bartosz Milewski's Category Theory for Programmers series; I was hooked.
I'll admit part of it is Milewski's infectious enthusiasm for the subject. In the preface of his book, he describes how category theory is a "treasure trove of extremely useful programming ideas", which inspired a lot of ideas in Haskell. It's an area of math that's unlike calculus or algebra, because instead of dealing with particulars, it deals with structure. It deals with the kind of structure that makes programs composable.
What I also found quite interesting is Milewski's thread on the 'rise' of functional programming. For a long time we were happy with imperative programming, using object-oriented programming as the paradigm for abstraction, hiding complexity (in particular, state) inside the object.
The 'end' of Moore's law has since brought multicore processors (on this topic, Sophie Wilson has an excellent talk here). However, imperative programming allows for side effects, and hiding mutable state in objects makes data races more likely. While we can carry on doing what we've been doing thus far, it'll be an ongoing struggle in this new concurrent world.
Why needlessly endure the complexity? Why not simply shift to the functional programming paradigm, where objects are immutable and functions are pure? Instead of building on shaky ground, Milewski's rallying cry is to fix the foundations so that we can all move forward more effortlessly.
Content: The moral bucket list
David Brooks disarmed me with "I’m a pundit, more or less paid to appear smarter and better than I really am", yet this essay makes me pause to reflect every time.
When I meet such a person it brightens my whole day. But I confess I often have a sadder thought: It occurs to me that I’ve achieved a decent level of career success, but I have not achieved that. I have not achieved that generosity of spirit, or that depth of character.
Content: How to interview your interviewers
I've started doing interviews recently. Thinking about interviews as a chat to get to know potential colleagues can help take the edge off.
That’s because finance, like other forms of human behavior, underwent a change in the twentieth century, a shift equivalent to the emergence of modernism in the arts—a break with common sense, a turn toward self-referentiality and abstraction and notions that couldn’t be explained in workaday English.
In poetry, this moment took place with the publication of “The Waste Land.” In classical music, it was, perhaps, the première of “The Rite of Spring.” Jazz, dance, architecture, painting—all had comparable moments. The moment in finance came in 1973, with the publication of a paper in the Journal of Political Economy titled “The Pricing of Options and Corporate Liabilities,” by Fischer Black and Myron Scholes.