Programming: Do not settle for a REPL transcript

Transcript for "Do not settle for a REPL" screencast:

https://www.youtube.com/watch?v=ZD8q3fMr5sE

CoderX here.

Today we will be using test-refresh to find influential Twitter friends. The goal is to illustrate why code reload is better than a REPL. Test-refresh watches the file system, reloads code, and runs tests. Jake McCrary wrote test-refresh while sporting a smart bow-tie. Programming is serious business.

A REPL is a read-eval-print loop. Forms are read interactively and evaluated, and the result of each evaluation is printed to the screen. But code reloading is better than a REPL.

So do not settle for a REPL.

There are three mature code reload workflows in Clojure:

Figwheel. We showed off how cool Figwheel is in previous episodes.
Ring reload middleware. An absolute must for server side development.
Test-refresh. The focus of this episode. Beats REPLs hands down.

They all watch files and reload code. The trigger for code evaluation is when we save a file, which is usually the right punctuation point in our workflow. Saving files works with any editor with no special plugins or keystrokes. The model is easy to understand and think about.

Before we get coding, I’ll make a quick disclaimer. This is not a testing rant.

This is about using code reload for an interactive workflow. So don’t be surprised when I don’t write tests… I want to compare apples to apples, experimenting from the REPL with experimenting from test-refresh.

To find influential Twitter friends I need to connect to Twitter, fetch a list of my friends, fetch the friends of each friend, and then Pagerank the network.

Let’s start a new project called twitternet. Add twitter-api to our project dependencies.

Navigate to apps.twitter.com and click “create new app”. I deleted this twitter app after gathering the data I needed, so these credentials are no longer live. You will need to create your own credentials. Add clj-http as a dependency.

Let’s make some calls by copying the examples. A few print statements allow us to examine the shape of the data. At this point we discover that rate limiting is quite severe: only 1 request per minute is allowed. Let’s be sure to save the output to files so we can use them later.

(ns twitternet.core
  (:require
    [twitter.oauth :as oauth]
    [twitter.api.restful :as rest]))

(def my-creds
  (oauth/make-oauth-creds
    "pfraQVsp9gYM1hxENMVfSfMBQ"
    "uDFD9sEjTQjholVJkItJXyo11Suvjsx4Gk5KKiBMmNQzUDUfo9"))

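(defn fetch-friends [id]
  ;; Stay under the rate limit: wait a minute before every request.
  (Thread/sleep 60000)
  (println "Fetching friends" id)
  ;; doto returns the ids after handing them to spit as its last argument,
  ;; so every response is also saved to <id>.txt.
  (doto (:ids (:body (rest/friends-ids
                       :oauth-creds my-creds
                       :params {:id id
                                :count 200
                                :skip_status true
                                :include_user_entities false})))
    (->> (spit (str id ".txt")))))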

(defn fetch-user [screen-name]
  (println "Fetching user" screen-name)
  (:body (rest/users-show :oauth-creds my-creds
                          :params {:screen_name screen-name})))

(defn get-network []
  (let [my-id (:id (fetch-user "timothypratley"))
        my-friends (fetch-friends my-id)]
    (into {my-id my-friends}
          (for [friend my-friends]
            [friend (fetch-friends friend)]))))

(spit "network.txt" (pr-str (get-network)))

Ok let’s let this baby roll! I follow over 100 people, so this is going to take nearly 2 hours… All done! We have a file representing my network.
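
With a limit of one request per minute, refetching is expensive, so it helps to read a saved response back instead of calling Twitter again. The following is a minimal sketch of that idea and is not code from the screencast. The namespace and the `cached` helper are hypothetical, and it assumes the fetched data round-trips through EDN.

```clojure
(ns twitternet.cache-sketch
  (:require
    [clojure.edn :as edn]
    [clojure.java.io :as io]))

;; Hypothetical helper: `fetch` is any zero-argument function that
;; performs the slow, rate-limited call.
(defn cached [path fetch]
  (let [f (io/file path)]
    (if (.exists f)
      ;; A saved copy exists, so read it back and skip the request.
      (edn/read-string (slurp f))
      ;; No saved copy yet: fetch once, save as EDN, return the data.
      (let [result (fetch)]
        (spit f (pr-str result))
        result))))
```

A call such as `(cached "12345.edn" #(fetch-friends 12345))` would then hit the network only the first time. The file name and id here are made up for illustration.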

My immediate network consists of the people I follow. Those people follow people outside my immediate network, but they also follow some of the people that I follow. This person has 3 people following them. This person has 2 people following them. But only I am following this person.

We can rank people based on the number of links. Pagerank uses current ranks to generate new ranks iteratively. Each link is given a weight based upon the current score of the source node, with which to calculate a new score for the target node. This is repeated until the scores settle. The interesting thing about Pagerank is that the quality of inbound links matters. Here yellow is recognized as important because blue links to it.
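
The iteration described above can be sketched in a few lines of Clojure. This is a simplified illustration and not LeaderboardX's implementation. The damping factor of 0.85 is a conventional choice assumed here, the iteration count is fixed rather than tested for convergence, and nodes without outbound links simply pass nothing on. All names are hypothetical.

```clojure
(ns twitternet.pagerank-sketch)

;; Conventional damping factor; an assumption, not taken from the screencast.
(def damping 0.85)

(defn nodes
  "Every node that appears as a source or a target."
  [network]
  (set (concat (keys network) (mapcat val network))))

(defn step
  "One iteration: each source splits its current score evenly across
  its outbound links, and each node sums the shares it receives."
  [network ranks]
  (let [base (/ (- 1.0 damping) (count ranks))
        shares (for [[source targets] network
                     :when (seq targets)
                     :let [share (/ (ranks source) (count targets))]
                     target targets]
                 {target share})
        inbound (apply merge-with + shares)]
    (into {}
          (for [node (keys ranks)]
            [node (+ base (* damping (get inbound node 0.0)))]))))

(defn pagerank
  "Starts every node at the same score and applies `step` a fixed
  number of times."
  [network iterations]
  (let [all (nodes network)
        initial (zipmap all (repeat (/ 1.0 (count all))))]
    (nth (iterate (partial step network) initial) iterations)))
```

For example, in `(pagerank {:a [:hub] :b [:hub] :c [:hub] :hub [:yellow] :d [:green]} 20)` both `:yellow` and `:green` have exactly one inbound link, yet `:yellow` ends up with the higher score because its link comes from the well-connected `:hub`.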

We will use LeaderboardX to do a Pagerank on our network. First we need to reshape the data into the expected input format. We want to generate lines of lists of people and who they follow. Great, now we can load our file. Oh no, our graph is way too large. There are 17 thousand nodes, so rendering them all is not possible. If we filter out nodes not in my immediate network there are only 15 hundred.

(ns twitternet.munge
  (:require
    [clojure.edn :as edn]
    [clojure.string :as string]))

(defn transform [network]
  (for [[person outs] network]
    (cons person (filter network outs))))

(defn reshape []
  (let [network (edn/read-string
                  (slurp "network.txt"))]
    (spit
      "lxnet.txt"
      (string/join
        \newline
        (cons
          "Person,Endorses"
          (map #(string/join "," %)
               (transform network)))))))
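
To make the target format concrete, here is a small sketch that applies the same joining steps to an in-memory network instead of reading network.txt. The namespace, the `network->csv` name, and the sample ids are illustrative only.

```clojure
(ns twitternet.munge-sketch
  (:require
    [clojure.string :as string]))

(defn network->csv
  "One header line, then one line per person: the person followed by
  the people they follow who are also keys of the network."
  [network]
  (string/join
    \newline
    (cons "Person,Endorses"
          (for [[person outs] network]
            ;; A map is a function of its keys, so (filter network outs)
            ;; keeps only the ids inside the immediate network.
            (string/join "," (cons person (filter network outs)))))))

;; A sorted map keeps the line order predictable for this example.
(println (network->csv (sorted-map 1 [2 3 4]
                                   2 [5 6 7]
                                   3 [1 2 8])))
;; Person,Endorses
;; 1,2,3
;; 2
;; 3,1,2
```

Ids 4 through 8 are dropped because they never appear as keys, which is the same filtering that shrinks the real graph.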

Success! My network looks like a hairball. There are many connections and nodes.

Notice that the top ranked member has fewer links than second place.

The runner up for highest Pagerank in my immediate network is Bruce Hauman, the author of Figwheel and Devcards.

And the winner is, Shaun LeBron, the author of Parinfer.

Let’s reflect for a moment and compare test-refresh with a REPL.

There is only the code file to edit.
There are no special keystrokes to evaluate forms or send them to the REPL.
We get instant notification when code fails to compile or execute.

I love tmux, emacs, and vim... but when it comes to REPL integration, things get pretty complicated.

It feels productive to have key combos to execute code, run tests, switch buffers, splice in results and all sorts of great stuff.

However I end up spending a lot of time switching window focus, sending code to the REPL, finding tests to run, forgetting to eval my function or file, and generally being busy interacting with the REPL. I make many mistakes, and blame myself for not being able to keep it all straight in my head.

In contrast, when I use test-refresh with any editor, my workflow is very simple. There is my code and there is the log of what happens when it reloads. From this simplicity flows productivity because I can focus on my code. My primary brain function is thinking about the program, not managing my REPL.

Cursive, by Colin Fleming, is well suited to this workflow because Cursive’s error detection, documentation and navigation features do not require a REPL.

If you are new to Clojure, I strongly encourage you to stick with your most comfortable editor for as long as possible. Learning a language is hard enough without learning a new editor at the same time. Test-refresh provides fast feedback without the need for any integration.

To set test-refresh up, add it to your Leiningen profiles.clj.

I highly recommend configuring the “changes only” and “quiet” options.

These options greatly reduce the amount of time and noise per refresh.
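
As a rough sketch, a user-level profile with both options enabled might look like the fragment below. The version string is an assumption, so check the lein-test-refresh README for the current release before copying it.

```clojure
;; ~/.lein/profiles.clj
;; The version number below is an assumed example, not a recommendation.
{:user {:plugins [[com.jakemccrary/lein-test-refresh "0.25.0"]]
        :test-refresh {:changes-only true ; only run tests in changed namespaces
                       :quiet true}}}     ; trim the output printed per refresh
```

Putting the plugin in the `:user` profile makes `lein test-refresh` available in every project without touching each project.clj.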

Ultra provides nicely formatted diffs when tests fail. To see it in action let’s write a basic test.

(ns twitternet.munge-test
  (:require
    [clojure.test :refer :all]
    [twitternet.munge :as munge]))

(deftest transform-test
  (is (= [[1 2 3]
          [2]
          [3 1 2]]
         (munge/transform {1 [2 3 4]
                           2 [5 6 7]
                           3 [1 2 8]}))))

A few other Leiningen plugins are worth a look. Ancient will upgrade project dependencies, provided all the tests still pass. Kibit detects non-idiomatic code. Eastwood is a linter that flags suspicious code. And bikeshed flags formatting problems.

Sometimes I don’t want to create a project to experiment with Clojure. Try CLJ is pretty handy for this because there is no startup time. LightTable has an instarepl which shows results inside the file you are currently editing. To see it in action, let’s try answering a StackOverflow question.

Let’s try this code out and see what we get. Hmmm the problem seems to be with the type conversions here. Yeah, they either need to make a true random bigint, or if a restricted domain of random numbers is acceptable do some extra casting.
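
The code from the question does not appear in this transcript, so the following is only a sketch of the two options just mentioned. Both function names are hypothetical, and the real question may have needed something different.

```clojure
(ns scratch.random-bigint-sketch)

;; Option 1: a random big integer drawn uniformly from [0, 2^bits),
;; built with the java.math.BigInteger (numBits, Random) constructor.
(defn rand-bigint [bits]
  (bigint (java.math.BigInteger. (int bits) (java.util.Random.))))

;; Option 2: when a restricted domain is acceptable, draw an ordinary
;; random integer below n (n must fit in an int) and cast the result.
(defn rand-small-bigint [n]
  (bigint (rand-int n)))
```

`(rand-bigint 128)` can return values far beyond the range of a long, while `(rand-small-bigint 1000)` stays below 1000 and only changes the type.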

I occasionally use LightTable like this for throwaway code... but for most of my work, I want to keep the code around.

Using test-refresh as a REPL replacement has made my coding workflow more effective. And it has encouraged me to add tests at times when I would otherwise have felt it was a chore. Next time you are about to lein repl, lein test-refresh instead.

Do not settle for a REPL.

Until next time, keep coding.
