W9 - Declaring writing bankruptcy

I had fallen behind on writing blog posts, taking notes to write out in full whenever I found the time. I’m now declaring ‘writing bankruptcy’, but I’d still like to share those notes.

A theme I’ve been coming across recently is “software is eating the world, but now AI is eating software”. It reminded me of this quote.

Technological advance is an inherently iterative process. One does not simply take sand from the beach and produce a Dataprobe. We use crude tools to fashion better tools, and then our better tools to fashion more precise tools, and so on. Each minor refinement is a step in the process, and all of the steps must be taken.

I spoke to a few early-stage startup founders, and I wondered how they thought about building a team around them. Cue Hank Paulson.

There is no perfect leader. Everyone is flawed and their strengths are usually the opposite of a weakness. The key ingredient I found for any of these CEOs that was just essential to success is the team they put around them. I called it "the right people in the right seats". You needed to play to your strength but you need to put people around you who you listen to, who can compensate for your weaknesses. If you didn't, these big jobs always uncover your weaknesses.

I was at law school in the early 2010s, and one summer I met up with a lawyer friend in London. I shared with her my disappointment at not getting an internship in corporate law.

Instead of offering words of comfort, she said something along the lines of “you have a tendency to speak your mind, but as lawyers we often need to err on the side of being diplomatic; I’m not sure you missing out on internships is necessarily a bad thing.”

Now I can tell her she was right. Working with software is not without its quirks, but I’ve yet to see someone describe their experience in the vein of the following excerpt.

There’s a saying out here. Every man has three hearts. One in his mouth for the world to know. Another in his chest, just for his friends. And a secret heart buried deep where no one can find it.

W8 - Learning to unlearn

I grew up in Kuala Lumpur. I’ve since had the chance to live in London, New York, Paris and Berlin. I’ve been in San Francisco since 2015.

While SF is the smallest city on the list, it’s here that I’ve ironically discovered the wonder of being away from the city. I spent time in Mendocino this week. It’s amazing to get away from the day-to-day. It’s like moving out of the trenches to see the battlefield.

We transition from school to college to work, often learning skills throughout (at least, ideally). What I’ve also since discovered is the importance of learning to unlearn.

We develop mental models to simplify how things work in reality, except we often retain those models even when they don’t work so well any more. Either the reason we adopted them, our priorities, or the underlying world itself has changed; yet the models remain. Sometimes they become our sacred cows.

What helps set the stage for revisiting those mental models is creating more space. How do you create space? Different things work for different people. Meditation. Journaling. Family. Friends. Nature.

I’m grateful to have made this latest discovery. Especially at times when the world feels like it’s at an inflection point, it’s reassuring to be surrounded by timelessness.

W7 - Making magic happen

A question I was asked during an interview was “what does a perfect day look like to you?”

I responded by talking about working with the perfect team, a variation on a post I had put up on LinkedIn.

The best team I was a part of made meetings fun. I loved hearing what others were working on and which bits were blockers for me, and sharing what I was working on to learn which bits were blockers for them. Without prompting from managers, team members would regularly share scripts, documentation and resources to help each other level up and move faster.

After the interview I thought about the question a bit more, and followed up by e-mail with a note of thanks and a quick postscript: coming up with an elegant solution after meditating on the problem for a few days (a la Hammock-Driven Development).

Having slept on it, I lean back towards the team. This is one of my favorite excerpts from Sam Altman’s advice when he turned 30.

Go out of your way to be around smart, interesting, ambitious people. Work for them and hire them (in fact, one of the most satisfying parts of work is forging deep relationships with really good people). Try to spend time with people who are either among the best in the world at what they do or extremely promising but totally unknown. It really is true that you become an average of the people you spend the most time with.

It’s tempting to look back on my own path as if it followed some grand design, but the recurring theme has really been this desire to be around smart, interesting, ambitious people.

I studied math in college partly because I thought I’d get into a better school than if I’d picked a more practical subject. I went to law school because my friends who did law seemed to ‘know how the system works’, which made me realize that knowing the grey areas in life is an effective complement to being good with numbers.

Today I work with software. In terms of unique aspects, there is a level of objectivity (it’s pretty clear when your code runs fast) as well as value capture from network effects. The aspect that I like the most can perhaps be best explained through a story.

Cash App came out of a hack week at Square, yet the buyer-facing Cash App now processes more volume than the seller-facing side of the company. There’s no amount of top-down effort that gets people excited to build something that eventually overtakes the original idea. That happens through getting smart, interesting, ambitious people in the same room and having them believe they have the agency to make magic happen.

W6D5 - Make idea babies

When I was interviewing last year, I was keen to move back to fintech. I wanted better context on problems in a specific domain, to match that against new capabilities technology now ‘unlocks’. The example that came to mind was Square helping vendors at farmers markets accept credit card payments, enabled by more accurate ML-based risk models.

Now I realize that the domain doesn’t have to be fintech. It’s true, most of the exposure I’ve had has been in payments and lending. That being said, what’s important is being at the interface between the business domain and technology. The same principle, of understanding domain-specific pain points and alleviating them with more sophisticated tools, still applies.

Perhaps Justine Musk says this best.

Choose one thing and become a master of it. Choose a second thing and become a master of that. When you become a master of two worlds (say, engineering and business), you can bring them together in a way that will a) introduce hot ideas to each other, so they can have idea sex and make idea babies that no one has seen before and b) create a competitive advantage because you can move between worlds, speak both languages, connect the tribes, mash the elements to spark fresh creative insight until you wake up with the epiphany that changes your life.

I shared her post on the first blog entry of my first time at Recurse Center. It’s a nice way to come full circle.

W6D4 - Rewriting the book of best practices

In a previous role, we had audits to help us refactor our data pipelines with confidence. While it may make sense to run audits in staging, staging data can be very different from production data. This means changes get merged based on guardrails that let staging run successfully, but end up breaking in production.
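As a rough sketch of what such an audit can look like (comparing the legacy and refactored pipeline outputs computed from the same input; the function and column handling below are my own illustration, not the actual tooling):

```python
import pandas as pd

def audit_pipeline_outputs(legacy: pd.DataFrame, refactored: pd.DataFrame, key: str) -> dict:
    """Compare legacy vs refactored pipeline outputs computed from the same input."""
    report = {
        "row_count_delta": len(refactored) - len(legacy),
        "missing_keys": set(legacy[key]) - set(refactored[key]),
        "extra_keys": set(refactored[key]) - set(legacy[key]),
    }
    # Column-level drift on the numeric columns both outputs share.
    shared = legacy.select_dtypes("number").columns.intersection(refactored.columns)
    for col in shared:
        report[f"{col}_sum_delta"] = refactored[col].sum() - legacy[col].sum()
    return report
```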

To get around the staging/production mismatch, you run those audits in production. Never test in production, you say? What’s the use of testing if it doesn’t stop you from breaking things? The following quote is from Erik Bernhardsson.

Let’s let the right workflows emerge from what makes teams the most productive, and let’s let data workflows stand on their own feet.

The book of best practices for data (perhaps ML too) is still being written. Plus there’s probably a startup idea there somewhere.

W6D3 - Gradient descent can write better code than you

Spending time at Recurse Center between roles is a great opportunity to think a bit deeper about what I’d like to do next. This usually happens through chats with fellow Recursers in batch, and it’s always fun to hear how others are going about their search.

I didn't do RC this time around, so that ‘refinement’ process has to be done on the go with real interviews. An interesting question I was asked was “what’s special about having machine learning in your system?”

I thought about my time at Square and what came to mind was this idea of moving from a deterministic system to a probabilistic one. Suppose a seller owes Square $100, and Square initiates a $100 ACH debit to recover those funds. It takes 3 business days before you know if the debit succeeds. Now the next day the seller needs to be credited $50. What do you do?

In the optimistic case, you assume that the debit will succeed so you let the $50 go through. In the pessimistic case, you assume that the debit will fail so you hold the $50 for two business days. Which one is the right call?

The flow with ML is to train a model based on historical data. If the seller looks closer to one where the debit would succeed, then let the $50 go through. If the seller looks closer to one where the debit would fail, then hold the $50. Hence we switch from making a binary decision, to having a threshold that determines how we would act.
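A minimal sketch of that threshold decision, assuming a scikit-learn-style classifier trained on historical sellers; the feature names and the 0.8 threshold are illustrative, not Square’s actual system:

```python
RELEASE_THRESHOLD = 0.8  # illustrative; tuned against the cost of a failed debit vs a delayed credit

def decide_credit(model, seller_features, credit_amount):
    """Release the seller's credit now only if the outstanding debit looks likely to clear."""
    # Probability that the in-flight ACH debit succeeds, per the trained model.
    p_debit_succeeds = model.predict_proba([seller_features])[0, 1]
    if p_debit_succeeds >= RELEASE_THRESHOLD:
        return "release", credit_amount   # optimistic path, now backed by data
    return "hold", credit_amount          # pessimistic path: wait out the 3 business days
```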

Of course, there is also the question of whether our historical data is sufficient for model training, and if it is, whether the return on investment justifies the extra complexity.

Andrej Karpathy takes this a step further in his post Software 2.0, jokingly describing it as follows.

Gradient descent can write better code than you. I’m sorry.

This was written in 2017, and perhaps even more pertinent now.

W6D2 - Revisit past projects with tools of the future

For my first data science project, I trained a machine learning model on Lending Club data. The hypothesis was that I could use that data to ‘cherry pick’ loans in a way that outperforms the average return. More specifically, all loans within a given grade pay the same interest rate, so if I can stack rank them and select, say, the top 5%, then I beat the average.
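The ‘cherry picking’ itself is easy to sketch once you have a model score per loan; a rough version, with made-up column names standing in for the Lending Club fields:

```python
import pandas as pd

def cherry_pick(loans: pd.DataFrame, top_pct: float = 0.05) -> pd.DataFrame:
    """Within each grade, keep only the loans the model scores most highly."""
    # `grade`, `model_score`, and `realized_return` are stand-in column names.
    ranked = loans.sort_values("model_score", ascending=False)
    return ranked.groupby("grade", group_keys=False).apply(
        lambda g: g.head(max(1, int(len(g) * top_pct)))
    )

# The bet: cherry_pick(loans)["realized_return"].mean() > loans["realized_return"].mean()
```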

What I didn’t quite expect is to later work for a startup that did exactly this.

I digress. The point I’m trying to get to relates to the loan data itself. The model I built mainly used numerical features like the borrower’s FICO score and income. There was also a free-text column containing what the borrower wrote about their intention for the loan proceeds. That text became simple features like number of characters, number of words, and number of sentences.

With text embeddings, this column can now become a vector, and the whole model can be trained on both the numerical features and the semantic meaning of the text. It’s one of many use cases that language models now unlock.
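A rough sketch of what that could look like today, where sentence-transformers is just one possible embedding choice and the column names are made up:

```python
import numpy as np
from sentence_transformers import SentenceTransformer
from sklearn.linear_model import LogisticRegression

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # one of many possible embedding models

def build_features(loans):
    """Concatenate numerical loan features with an embedding of the free-text description."""
    numeric = loans[["fico_score", "annual_income"]].to_numpy()  # stand-in column names
    text_vecs = encoder.encode(loans["description"].tolist())    # one vector per description
    return np.hstack([numeric, text_vecs])

# X = build_features(loans); y = loans["defaulted"]
# clf = LogisticRegression(max_iter=1000).fit(X, y)
```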

W6D1 - Be excited

In a post on job searching, Haseeb Qureshi advises candidates to be excited.

Be excited about the company. It’s trite, but it makes a huge difference. Be excited to interview. Be excited to learn about what your interviewer does and the prospect of getting to work with them.

I'm often fascinated by how much you learn about companies through the interview process; plus it's often less motivating to dig into them outside of a job search. This helps me stay excited.

W5D5 - May the challenges I face help my heart of compassion to open

I've been on the job search a few times now, but I still get the emotional swings. I guess they'll never go away, so the only thing you can do is make the experience less painful.

I came across this quote by Jack Kornfield on Tim Ferriss' podcast.

Let suffering teach you compassion because in Tibet, in some of the Tibetan teachings, they actually pray for suffering. They say, "May I be granted enough suffering so that the great heart of compassion will open in me."

It's a helpful reframing, to think of the experience as another rep at the emotional gym.

W5D4 - Those who cannot remember the (recent) past are (also) condemned to repeat it

My notes on what I'm doing daily have become sparser; note to self to be more verbose.

What is in my notes is reading Vicki Boykis' post "What we don't talk about when we talk about building AI apps". In particular, even though we're moving into this new paradigm of LLMs, we still have to contend with bloated Docker containers. The post describes images with deep learning libraries as large as 10 GB!

This actually reminded me of work I did at Airtable to reduce Docker image size, by

  1. Reordering the Dockerfile so that instructions that change less frequently sit higher in the file (sketch below),
  2. Removing unused Python modules, and
  3. Setting AWS CodeBuild to do a deep rather than a shallow git clone

I know, it's nothing fancy. The last one was particularly counter-intuitive, but the AWS rep said they're using a Go git client and apparently that made a difference (it's also unclear whether that last hack still works). That said, all in all it was a 60% reduction in image size.
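On the first point, the idea is to keep slow-changing layers (like dependency installs) above fast-changing ones (like application code) so Docker's layer cache isn't invalidated on every commit. A minimal sketch with an illustrative base image and paths, not the actual setup:

```dockerfile
FROM python:3.11-slim
WORKDIR /app

# Dependencies change rarely, so install them first and let Docker cache this layer.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Application code changes on every commit, so copy it last.
COPY . .
```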

Modal discusses building a Docker-compatible custom container runner, image builder and filesystem. Nice.