RC W8 - Think before you build

Compilers

Lesson of the week - when screen sharing on Zoom, disconnect additional monitor(s).

I've been working on porting Crafting Interpreters from C to Python, and this naturally involves making some adjustments. The first is avoiding circular dependencies in the Python version. I imagine this is less of an issue in C, as the compilation/linking stage effectively combines all the source code into a single blob. The second is replacing each pointer into an array with the array itself plus an array index.
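To make the second adjustment concrete, here is a minimal sketch of what it might look like. The names (Chunk, VM, read_byte) loosely follow the book's C version but are illustrative here, not taken from my actual port. In C the instruction pointer is a pointer into the bytecode array; in Python it becomes a plain integer index kept next to the list it indexes.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Chunk:
    # The bytecode array. In C this would be a `uint8_t*` buffer.
    code: List[int] = field(default_factory=list)


@dataclass
class VM:
    chunk: Chunk
    # In C, `ip` is a pointer into chunk->code. Python has no pointer
    # arithmetic, so we keep the array (chunk.code) and an index (ip).
    ip: int = 0

    def read_byte(self) -> int:
        # Equivalent of the C idiom `*ip++`: read, then advance.
        byte = self.chunk.code[self.ip]
        self.ip += 1
        return byte
```

The cost of this approach is that anything that used to take a single pointer now needs both the array and the index, which is easy to get subtly wrong once several call frames share one stack.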

I'm happy with the progress so far, but now I'm implementing functions and the interactions are getting a lot more complex. The hardest part is how variable scopes interact with function call frames; I'm constantly puzzled about how to debug (and whether my adjustments overlooked more fundamental underpinnings of how things work). I know, last week I was all about "let's set up guardrails" to make reasoning about the code easier - it's all there now and I'm still stuck.

I'm contemplating starting over with my own custom implementation, but with function support right out of the gate (or at least, with the minimal infrastructure needed for the implementation). I anticipate this involves a lot more sketching designs on paper even before the first line of code - how fun!

The other fun thing with creating your own toy implementation is selecting your own set of reserved words. I've switched from 'var' to 'let' (to suppress prior trauma of learning JavaScript pre-ES6) and from 'this' to 'self'.

WebAssembly

New technology can be rough around the edges. Earlier in my time at RC I wanted to set up Python-Rust interop via WebAssembly. This involves compiling Rust functions to WebAssembly and loading the .wasm binaries in Python, in order to benefit from performance improvements.

I finally got this to work, example here. Calculating the 10,000th prime in Python with Rust achieves a 10x speedup vs pure Python. What's interesting is this closed issue, which might have been what tripped me up previously. The issue close date? August 24, 2020.
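For a sense of what is being timed, here is a pure-Python baseline of the kind of function involved. This is a sketch using simple trial division, not the code from the linked example, and the Rust/WebAssembly side is left out since loading the .wasm binary depends on which runtime you pick.

```python
def nth_prime(n: int) -> int:
    """Return the n-th prime (1-indexed) using trial division."""
    if n < 1:
        raise ValueError("n must be at least 1")
    primes = []
    candidate = 2
    while len(primes) < n:
        is_prime = True
        for p in primes:
            # Only need to test divisors up to sqrt(candidate).
            if p * p > candidate:
                break
            if candidate % p == 0:
                is_prime = False
                break
        if is_prime:
            primes.append(candidate)
        candidate += 1
    return primes[-1]


print(nth_prime(10_000))  # 104729
```

A tight numeric loop like this is where compiled code tends to help the most, since nearly all the time goes into interpreter overhead rather than I/O.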

Content: What you'll wish you'd known

Paul Graham was invited to speak at a high school in 2005, but somehow the school authorities vetoed the plan. I wonder if they're kicking themselves now. 

http://www.paulgraham.com/hs.html

The excerpt below is a highlight from a recent re-reading of the talk he prepared. On the back of this I'm going to indulge myself on something I've been curious about for a while - design.

If you're deciding between two projects, choose whichever seems most fun. If one blows up in your face, start another. Repeat till, like an internal combustion engine, the process becomes self-sustaining, and each project generates the next one. (This could take years.)

The excerpt that caught my eye on the first reading emphasizes the importance of experience, and draws parallels with Rilke's quote.

If it takes years to articulate great questions, what do you do now, at sixteen? Work toward finding one. Great questions don't appear suddenly. They gradually congeal in your head. And what makes them congeal is experience.

Content: Experiments at Airbnb

Data science at Airbnb had a mixed reputation a few years back, but something they did very well was marketing the practice through content and events. The well-curated blog attracted many to apply for a role. The post I enjoyed the most discussed best practices for A/B testing - on duration of experiments, understanding context and setting up guardrails.

https://medium.com/airbnb-engineering/experiments-at-airbnb-e2db3abf39e7

Speaking of experiments and causal inference, I'm adding causal forests to the list of things to review...

Content: Indie Game

I'm fascinated by how distribution (or perhaps in a more direct way, monetization) plays a role in content creation. Jonathan Blow in his talk discusses parallels between TV shows and computer games when each medium moves from gatekeepers to direct-to-consumer distribution - highly recommended.

What's also super fascinating is seeing the travails of indie game developers. The attention to detail, the degree of craftsmanship and the pursuit of perfection - I can't help but think of Jiro.

Content: Empty Streets (Haji + Emanuel Remix)

This mix is legendary, and amusingly, apt for 2020.

Content: Think before you build

I remember reading this post, spending hours on Google looking for it again, and feeling very sad when I couldn't find it. When I was compiling the content I wanted to share on this blog, I looked through my notes and there it was. The lone URL, no annotations, no comments. It was like finding treasure. I was overjoyed.

Perhaps there's a dream job there somewhere. Content finder?

http://www.slate.com/id/2289527

The post is about how computers have done wonders for productivity, but in many cases speed compensates for the lack of rigorous thought. It's a reminder to pause and reflect before we write that first line of code, and as per a previous post, to cut through to what matters by thinking clearly from first principles.

What kept me looking for this post were the immortal words of Renzo Piano.

But architecture is about thinking. It's about slowness in some way. You need time. The bad thing about computers is that they make everything run very fast, so fast that you can have a baby in nine weeks instead of nine months. But you still need nine months, not nine weeks, to make a baby.

RC W6D4 - The engineer's guide to career growth

Compilers

Learning about compilers has been fun. We started with a stack-based virtual machine that executes bytecode, just completed the scanner (converts source code into tokens), and are now getting into the compiler (converts tokens into bytecode).
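As a rough illustration of the first piece, here is a minimal stack-based VM in Python. The opcode names and the run function are my own simplification for this post, not the book's implementation; the bytecode is written by hand, which is the job the compiler will eventually take over.

```python
from enum import IntEnum, auto


class Op(IntEnum):
    CONSTANT = auto()  # followed by one operand: index into the constants table
    ADD = auto()
    MULTIPLY = auto()
    NEGATE = auto()
    RETURN = auto()


def run(code, constants):
    """Execute a flat list of opcodes/operands and return the final value."""
    stack = []
    ip = 0  # index of the next instruction to execute
    while True:
        op = code[ip]
        ip += 1
        if op == Op.CONSTANT:
            stack.append(constants[code[ip]])
            ip += 1
        elif op == Op.ADD:
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
        elif op == Op.MULTIPLY:
            b = stack.pop()
            a = stack.pop()
            stack.append(a * b)
        elif op == Op.NEGATE:
            stack.append(-stack.pop())
        elif op == Op.RETURN:
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode {op}")


# Hand-assembled bytecode for (1 + 2) * 3
code = [Op.CONSTANT, 0, Op.CONSTANT, 1, Op.ADD, Op.CONSTANT, 2, Op.MULTIPLY, Op.RETURN]
print(run(code, [1, 2, 3]))  # 9
```

Notice there are no named operands anywhere: every instruction finds its inputs on top of the stack, which is what keeps the instruction set so small.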

Right now things still feel a little removed from the Fitzgerald vs Egorov post. The optimization that stood out to me was inlining, where the compilation process replaces a function call with the body of the function itself (saving the overhead of the call). Egorov achieved this in JavaScript by stringifying a function to get its source text; Fitzgerald noted in Rust this simply involves annotating the function with #[inline].

Stack-based VMs are simple yet can elegantly do a lot! Register-based VMs, in contrast, are more complex but can achieve better performance. I amuse myself thinking about the time I reviewed my notes on MIPS (register) thinking it could help me understand WebAssembly (stack). Bob Nystrom's all-time favourite CS paper is actually on this topic, on Lua moving from a stack-based to a register-based instruction set.

Professional services

Paul Graham's essay described how the 1980s saw a shift from large corporations being the most desirable employers, to professional services. I think of this topic fondly. My very first job was in finance, which was the default of sorts for college graduates then.

The progression in most professional services roles has junior employees executing what the client asks for, and senior employees managing client relationships. I liked the execution part, and always wondered, "What if I'm not into golf?" (cue Mad Men episode). It's curious how in some cases this model is flipped - Google ICs go all the way up to Senior Fellow i.e. Level 11.

Content: Engineering management

We've covered data science, product management, design, dev ops and data engineering so far. I've saved this one for last - engineering management. I love Julia Evans' zine on the topic, as well as this excellent post by Raylene Yung.

https://firstround.com/review/the-engineers-guide-to-career-growth-advice-from-my-time-at-stripe-and-facebook

I wouldn't say I have a burning desire to be a manager. That said, I do feel (1) there are lots of soft skills that get honed when you're responsible for your team's success, and (2) having that experience helps you empathize with your own manager. Plus think of all of the books that now become more interesting...

RC W6D3 - Finding a success metric

Compilers

In yesterday's post, I described taking a closer look at how a browser works (in particular, HTML/CSS rendering) as well as compilers, but left out discussing the missing piece connecting the two - executing JavaScript.

We can think about JavaScript execution as translating source code to machine code (which the computer understands). This step could be done through the use of an interpreter, which translates and executes line-by-line. This allows for a fast startup time, but can be slower overall since the same translation may be done over and over again. The other option is to use a compiler, which translates all the source code at once prior to execution. This allows for more optimizations but is more complex and thus slower to start.

Modern JavaScript engines like Chrome's V8 combine the 'best of both worlds'. V8 starts by running the source code through the interpreter, but also has a profiler that identifies parts of the code that are run repeatedly. These are then recompiled into highly-optimized machine code, allowing for impressive speedups over time. This process is known as just-in-time (JIT) compilation.
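The interpret-first, compile-when-hot idea can be mimicked with a toy in Python. To be clear, this is an analogy and not how V8 works internally: the "interpreter" walks a small expression tree on every call, the "profiler" is just a call counter, and the "compiler" turns the tree into a Python lambda via eval. All names and the threshold are made up for illustration.

```python
import operator

HOT_THRESHOLD = 3  # arbitrary: calls before we bother "compiling"

OPS = {"add": ("+", operator.add), "mul": ("*", operator.mul)}


def interpret(expr, x):
    """Slow path: walk the expression tree every time."""
    kind = expr[0]
    if kind == "num":
        return expr[1]
    if kind == "var":
        return x
    _, fn = OPS[kind]
    return fn(interpret(expr[1], x), interpret(expr[2], x))


def to_source(expr):
    """Translate the tree into Python source text, once."""
    kind = expr[0]
    if kind == "num":
        return repr(expr[1])
    if kind == "var":
        return "x"
    symbol, _ = OPS[kind]
    return f"({to_source(expr[1])} {symbol} {to_source(expr[2])})"


class TieredFunction:
    def __init__(self, expr):
        self.expr = expr
        self.calls = 0        # the "profiler"
        self.compiled = None  # filled in once the function is hot

    def __call__(self, x):
        if self.compiled is not None:
            return self.compiled(x)  # fast path
        self.calls += 1
        if self.calls >= HOT_THRESHOLD:
            # Safe here only because the source text is generated by us.
            self.compiled = eval(f"lambda x: {to_source(self.expr)}")
        return interpret(self.expr, x)


# f(x) = x * 2 + 1
f = TieredFunction(("add", ("mul", ("var",), ("num", 2)), ("num", 1)))
print([f(i) for i in range(5)])  # [1, 3, 5, 7, 9]
```

The trade-off from the previous paragraph shows up even here: the first calls start immediately but pay for the tree walk each time, while the compiled version costs something up front and only pays off if the function keeps being called.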

When running microbenchmarks, I was surprised (and impressed!) to find JavaScript performance times to be comparable to Rust. What I didn't compare was memory usage, though perhaps you're already familiar with this from using Chrome. Writing high performance JavaScript is very much tied to keeping the JIT happy, which got me looking at compilers.

A discussion of how V8 makes trade-offs - running a single line of code vs iterating through a loop, memory usage in laptops vs mobile - can be found in the Google I/O talk here.

Startups

Andrey and I talked about startups over a fun coffee chat today. We have a tendency as developers to build something cool that we think will translate well into a startup, but in doing so overlook two key skills.

The first is the skill of 'making money'. This involves being mindful about the commercial aspects of the product (potential user base, how to monetize, pricing) as well as the business in general (accounting, payroll, AWS bill). Naturally the idea here is to have money in exceed money out, or at least, be able to pitch VCs that you'll end up in this state.

The second is sales. While it's easy to think of sales as a part of the first skill, it's clearer as a separate skill when you think about how you can sell something that's free. Sales more broadly involves growing the adoption of your product; a large user base helps create a large paying user base.

A useful analogy is to think about being a good developer and being good at interviews. The two skills reinforce each other but involve different training regimes. Keeping the two distinct helps us break down real-life examples into 'case studies' and assess how well they do in each dimension.

What's very interesting is how Andrey uses this framing for open source software - he leans towards building something that's cool but also something easy to explain and easy to demo.

Content: Sam Altman

I enjoyed the New Yorker profile of Sam Altman so much, I still keep the physical copy. I know, I should have kept the one on Jack Dorsey too.


The article has a brief history of YC and how it's changed under Altman, including this great excerpt.

Launching a startup in 2016 is akin to assembling an alt-rock band in 1996 or protesting the Vietnam War in 1971 - an act of youthful rebellion gone conformist. Since 2005, the year Y Combinator began, accelerators have sprung up everywhere to help transform startups from a skein of code into a bona-fide company.

Altman has since moved to leading OpenAI. He's true to form on gambling on the next moonshot, as per the advice from his post. The question for myself is, what's my success metric?

It’s useful to focus on adding another zero to whatever you define as your success metric - money, status, impact on the world, or whatever. I am willing to take as much time as needed between projects to find my next thing. But I always want it to be a project that, if successful, will make the rest of my career look like a footnote.

RC W6D2 - How the browser works

Compilers

I felt a little out of my depth trying to follow last week's JavaScript optimizations more closely, so I started looking at compilers to build up some background.

Compilers was the first class I signed up for at Bradfield. I remember Oz described how fortunate we were as developers to have the ability to work at a higher level (libraries, frameworks) as well as at a lower level (compilers). In light of that, the syllabus started with the following quote by Ras Bodik.

Take historical note of textile and steel industries: do you want to build machines and tools, or do you want to operate those machines?

Alas I discovered on Day 1 how it's more helpful to cover computer architecture first; I dropped compilers for that course instead. I went on to do all the core CS classes at Bradfield... except compilers. The class at the time had the Dragon book as the main text. This may have switched to Crafting Interpreters by Bob Nystrom (available for free online), which has an excerpt I find particularly endearing.

It’s the book I wish I had when I first started getting into languages, and it’s the book I’ve been writing in my head for nearly a decade.

The book itself is filled with quotes, and why not, let's inspire ourselves with this one by Donald Knuth.

If you find that you’re spending almost all your time on theory, start turning some attention to practical things; it will improve your theories. If you find that you’re spending almost all your time on practice, start turning some attention to theoretical things; it will improve your practice.

How the browser works

Running microbenchmarks led me down the path of JavaScript engines but also browsers in general. The mental model of the browser that I found helpful separates the browser UI, called the chrome, from the browser engine.

Let's split the browser engine into two parts. The first part 'talks to the internet' - it starts with the URL, finds the remote server, and gets the desired payload i.e. HTML/CSS/JS. The second part converts the HTML/CSS/JS into what you see and interact with in the browser.

I followed along in Python the series by Matt Brubeck on building a toy HTML/CSS rendering engine. This involves parsing the HTML/CSS text into tree-like objects that the computer understands, which are then processed and rendered visually.
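To give a flavour of the 'text into tree-like objects' step, here is a small sketch. Note the shortcut: the series writes its parser by hand, whereas this leans on Python's standard library html.parser to do the tokenizing, and only builds the tree. The Node and TreeBuilder names are my own, and void elements like br or img are not handled.

```python
from dataclasses import dataclass, field
from html.parser import HTMLParser


@dataclass
class Node:
    tag: str                      # element name, or "#text" / "#document"
    attrs: dict = field(default_factory=dict)
    children: list = field(default_factory=list)
    text: str = ""                # only used by "#text" nodes


class TreeBuilder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.stack = [self.root]  # path from the root to the open element

    def handle_starttag(self, tag, attrs):
        node = Node(tag, dict(attrs))
        self.stack[-1].children.append(node)
        self.stack.append(node)

    def handle_endtag(self, tag):
        # Simplification: assumes well-nested markup with no void elements.
        if len(self.stack) > 1:
            self.stack.pop()

    def handle_data(self, data):
        if data.strip():
            self.stack[-1].children.append(Node("#text", text=data.strip()))


def parse(html: str) -> Node:
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


root = parse('<div id="main"><p>Hello</p></div>')
div = root.children[0]
print(div.tag, div.attrs, div.children[0].children[0].text)  # div {'id': 'main'} Hello
```

CSS gets the same treatment into its own tree of rules, and the rendering stages then walk both trees together to work out what goes where on screen.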

I also came across a post describing how Google created V8 because Google Maps had been running really slowly, and getting Google Maps to run fast helps Google sell more ads. Perhaps. On the flip side, Google Maps is a fantastic product, V8 led to Node, and the need for speed led to network innovations like SPDY and QUIC.

Content: The Refragmentation

No discussion of YC is complete without Paul Graham. My favorite PG essay is a little tangential to the topic of startups; the essay here attempts to explain the increasing fragmentation of society.


As controversial but perhaps easier to dispute is my favorite PG tweet.