RC W1D4 - Advice for data scientists

Twitter API

It's very convenient how an HTML/CSS/JS project on repl.it is automatically hosted, so there's no need to explicitly set up a back end (Hello World example here, viewable here).

I've been wanting to play around with Twitter's API, and thought having the client-side JavaScript make the API call would keep things simple. Annoyingly, .env files for HTML/CSS/JS projects get exposed, and the more interesting parts of the API are behind auth.

OK so I'll need my own back end. It turns out this is not too complicated. I stitched together a minimal Flask web server that dumps the response in the HTML (project here, viewable here).
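To make the "own back end" idea concrete, here is a minimal sketch of the pattern: the server holds the token, calls the API, and dumps the JSON response into an HTML page. The project itself used Flask; to stay dependency-free this sketch swaps in the standard library's http.server instead. API_URL, fetch_api_response, render_page and make_handler are hypothetical names chosen for illustration, not taken from the project.

```python
import html
import json
import urllib.request
from http.server import BaseHTTPRequestHandler

# Hypothetical placeholder endpoint; substitute the real API URL.
API_URL = "https://api.example.com/resource"


def fetch_api_response(token: str) -> dict:
    """Call the API from the server, so the token never reaches the browser."""
    request = urllib.request.Request(
        API_URL, headers={"Authorization": f"Bearer {token}"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def render_page(payload: dict) -> str:
    """Dump the JSON payload into a bare HTML page, escaped so it shows as text."""
    body = html.escape(json.dumps(payload, indent=2))
    return f"<!doctype html><html><body><pre>{body}</pre></body></html>"


def make_handler(fetch):
    """Build a request handler that serves whatever `fetch()` returns."""

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            page = render_page(fetch()).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "text/html; charset=utf-8")
            self.send_header("Content-Length", str(len(page)))
            self.end_headers()
            self.wfile.write(page)

    return Handler
```

To serve it, one would read the token from an environment variable on the server and pass something like make_handler(lambda: fetch_api_response(token)) to http.server.HTTPServer, then call serve_forever(). The point of the design is that the secret only ever lives in the server process.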

Content: Data science

Today I also presented on applying machine learning to payments use cases - this leads nicely into the content of the day.

https://multithreaded.stitchfix.com/blog/2015/03/31/advice-for-data-scientists

I came across this Stitch Fix blog post when I was first interviewing for a data science role. The advice to choose a company (or at least give considerable weight to it) based on whether data science makes or breaks the business is fantastic. If there's one article I'd recommend on the topic, it's this one.

RC W2D1 - Work hard

Twitter API

It’s Open Source Week at RC! Today I attended a few events to kick off the week, and spent time closing the circle on things I started looking at last week.

First, the Twitter API. The API doesn’t have a /bookmarks endpoint, so instead I looked at the /friends endpoint which returns the list of users you follow. The idea is to find new users to follow i.e. a ‘user recommender’. The algorithm I used was:

1. Find all users I follow (let’s call this group A)
2. Find all users these users follow (group B)
3. For every user in group B, count the number of users in group A who follow that user
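The three steps above can be sketched roughly as follows. This is an illustration under assumptions, not the code actually used: get_following is a hypothetical callable standing in for the /friends call, and dropping myself and the users I already follow from the results is my assumption about what "new users" means.

```python
from collections import Counter
from typing import Callable, Hashable, Iterable


def recommend_users(
    me: Hashable,
    get_following: Callable[[Hashable], Iterable[Hashable]],
    top_n: int = 3,
) -> list:
    """Rank users by how many of the people I follow also follow them."""
    # Step 1: everyone I follow (group A).
    group_a = set(get_following(me))

    # Steps 2 and 3: each member of group A gives one vote to every user
    # they follow (group B). set() guards against duplicate entries.
    votes = Counter()
    for followed in group_a:
        for candidate in set(get_following(followed)):
            votes[candidate] += 1

    # Assumption: skip myself and anyone I already follow.
    for known in group_a | {me}:
        votes.pop(known, None)

    return votes.most_common(top_n)
```

With a toy graph such as {"me": ["a", "b", "c"], "a": ["x", "y", "b"], "b": ["x", "me"], "c": ["x", "y"]}, user x collects three votes and y two, while b and me are filtered out.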

I follow 291 users, so the maximum number of ‘votes’ each user in group B can get is 291. While this seems simple, there’s a rate limit of 15 API calls every 15 minutes for that endpoint, so 291 calls take about 5 hours. In descending order of votes, the top 3 are:

1. Elon Musk (116 votes)
2. Barack Obama (110)
3. Bill Gates (105)
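As a sanity check on the 5-hour figure, the rate-limit arithmetic can be written out. This sketch assumes one API call per followed user (no pagination) and that a full window's worth of calls can be made as soon as each window opens; estimated_minutes is a hypothetical helper name.

```python
import math


def estimated_minutes(
    total_calls: int, calls_per_window: int = 15, window_minutes: int = 15
) -> int:
    """Minutes spent waiting on the rate limit to make `total_calls` calls."""
    windows = math.ceil(total_calls / calls_per_window)
    # The first window needs no wait, so only (windows - 1) waits occur.
    return max(windows - 1, 0) * window_minutes


print(estimated_minutes(291))  # 285 minutes, i.e. just under 5 hours
```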

WebAssembly

The second item I looked at last week was WebAssembly. I can’t help but find it amusing that this optimization looks to shave milliseconds off a job that takes 5 hours...

The simplest way I found to run WebAssembly was actually outside the browser, with Wasmtime (Rust example here, Python here). It probably makes sense to revisit running in the browser once there's a more sensible use case, or at least after the events this week.

Content: Wisdom beyond your years

Re: extra content, I absolutely adore this post Sam Altman created when he turned 30. The wisdom here is way beyond his years.

https://blog.samaltman.com/the-days-are-long-but-the-decades-are-short

On work: it’s difficult to do a great job on work you don’t care about.  And it’s hard to be totally happy/fulfilled in life if you don’t like what you do for your work.  Work very hard—a surprising number of people will be offended that you choose to work hard—but not so hard that the rest of your life passes you by.  Aim to be the best in the world at whatever you do professionally.  Even if you miss, you’ll probably end up in a pretty good place.