Porting 30K Lines of Code from Flow to TypeScript

David Gomes

We recently ported the 30,000 lines of JavaScript code in MemSQL Studio from Flow to TypeScript. In this article, I describe why we ported our codebase, how we carried out the process, and how it has been working out for us.

I’d like to start by saying that my goal with this blog post is not to condemn Flow or usage of Flow. I highly admire the project, and I think that there is enough space in the JavaScript community for both type checkers. At the end of the day, each team should study all their options and pick what’s best for them. I sincerely hope this article helps you with that choice.

Let’s begin by setting some context. Here at MemSQL, we are big fans of statically and strongly typing our JavaScript code. This allows us to avoid many of the problems associated with dynamic and weak typing, such as:

  1. Runtime type errors, due to different parts of the code not agreeing on implicit type contracts.
  2. Wasted developer time, due to the need to write tests for trivial things such as parameter type checking.
  3. Increased bundle size caused by runtime type checking.
  4. The lack of editor/IDE integration for dynamically and weakly typed code, because editor/IDE features such as jump-to-definition, mechanical refactoring, and others work better with statically and strongly typed code.
  5. The inability to write code based on a data model. With statically and strongly typed code, we can base our code on a data model. We can design our data types first, then much of our code basically just “writes itself”.

These are just some of the advantages of static typing. I describe a few more in a recent blog post about Flow. (Note: A previous version of this blog post appeared on the author’s personal blog.)

Starting with Flow

In early 2016, we started using tcomb to ensure some runtime type safety in one of our internal JavaScript projects (disclaimer: I was not a part of that project). While runtime type checking is sometimes useful, it doesn’t even begin to scratch the surface of the power of static typing [1]. With that in mind, we decided to start using Flow for another project we started in 2016.

At the time, Flow was a great choice because:

  • Flow is backed by Facebook, which has done an amazing job at growing React and the React community. (They also develop React using Flow).
  • We didn’t have to buy into an entirely new ecosystem of JavaScript development. Dropping Babel for tsc (TypeScript compiler) was scary because it wouldn’t give us the flexibility to switch to Flow or another type checker in the future. (Obviously, this has changed since then).
  • We didn’t have to type our entire codebase from the beginning, which allowed us to get a feel for statically typed JavaScript before we went all-in. Rather, we could just type a subset of the files. Nowadays, both Flow and TypeScript allow you to do this.
  • TypeScript, at the time, was lacking some basic features that Flow already supported, such as lookup types, generic parameter defaults, and others.

When we started working on MemSQL Studio in late 2017, we set out to achieve full type coverage across the entire application. (All of it is written in JavaScript, and both the frontend and backend run inside the browser). We decided to continue using Flow for this project, as that’s what we had been successfully using in the past.

However, Babel 7 being released with TypeScript support definitely got my attention. This release meant that adopting TypeScript no longer meant buying into the entire TypeScript ecosystem, and that we could keep using Babel to emit JavaScript. More importantly, this meant that we could actually use TypeScript as a type checker, and not so much as a “language” per se.

Personally, I consider that separating the type checker from the emitter is a more elegant way of achieving static (and strong) typing in JavaScript because:

  1. It’s a good idea to have some separation of concerns between what emits ES5 and what does type checking. This allows for less lock-in around type checkers and it accelerates development speed; if the type checker is slow for whatever reason, your code will still be emitted right away. [2]
  2. Babel has amazing plugins and great features that TypeScript’s emitter doesn’t have. As an example, Babel allows you to specify which browsers you want to support, and it will automatically emit code that is valid on those browsers. This is very complicated to implement, and it makes more sense to only have Babel implement it, instead of duplicating this effort in two different projects.
  3. Except for the lack of static typing, I like JavaScript as a programming language. I trust that ECMAScript will be around for a good long while, whereas I have no idea how long TypeScript will be around for. For this reason, I prefer to keep writing and “thinking” in JavaScript. (Note that I always say “using Flow” or “using TypeScript”, instead of “in Flow” or “in TypeScript”, because I always think about these two projects as tools and not as full programming languages).

There are some downsides to this approach, of course:

  • The TypeScript compiler could theoretically perform bundle optimizations based on types, and you are missing out on that by having a separate emitter and type checker.
  • Project configuration becomes a bit more complicated when you have more tools and development dependencies. I think this is a weaker argument than most people make of it, because having both Babel + Flow was never a source of configuration issues in our projects; we expect this to continue with the move to TypeScript.

Investigating TypeScript as an Alternative to Flow

I had been noticing a growing interest in TypeScript in both online and local JavaScript communities. As such, when I first found out that Babel 7 supported TypeScript, I started investigating a potential move away from Flow. On top of the growing interest in TypeScript, we had encountered various frustrations with Flow:

  1. Lower quality editor/IDE integrations (when compared to TypeScript). Nuclide (Facebook’s own IDE, which had the best Flow integration) being deprecated did not help.
  2. Smaller community [3] and therefore fewer and overall lower quality type definitions for various libraries (more on this later).
  3. Lack of a public roadmap, and little interaction between the Flow team at Facebook and the community. You can read this comment by a Facebook employee for some more details.
  4. High memory consumption and frequent memory leaks — various engineers in our team have experienced Flow taking up almost 10 gigabytes of RAM every now and then.

Of course, we also had to research whether TypeScript was sufficient for us. This was very complicated, but it involved a thorough reading of the documentation that helped us figure out that every feature in Flow has an equivalent in TypeScript. I then investigated the TypeScript public roadmap and was delighted with the features that lay ahead (e.g. partial type argument inference, which is a feature we used in Flow).

Porting 30K+ Lines of Code from Flow to TypeScript

The first step to actually porting all of our code from using Flow to TypeScript was to upgrade Babel from 6 to 7. This was somewhat straightforward, but it took us around two engineer days, since we decided to also upgrade from Webpack 3 to 4 at the same time. Since we have some legacy dependencies vendored in our source code, this was harder for us than it should be for the vast majority of JavaScript projects.

After upgrading was done, we were able to replace Babel’s Flow preset with the new TypeScript preset, then run the TypeScript compiler for the very first time against our full source code written using Flow. It resulted in 8245 syntax errors. (The tsc CLI doesn’t give you the real errors for the full project until you have 0 syntax errors.)

This number scared us (a lot) at first, but we quickly figured out that most of these errors were related to TypeScript not supporting .js files. After some investigation, I found out that TypeScript files have to end in either “.ts” or “.tsx” (if they have JSX in them). I don’t want to think about whether a new file I’m creating should have a “.ts” or “.tsx” extension, and I think that’s a poor developer experience. For that reason, I just renamed every single file to “.tsx”. (Ideally, all of our files would have a “.js” extension like in Flow, but I would also be okay with “.ts”).

After that change, we had around 4000 syntax errors, about half as many. Most of them were related to import type, which can be replaced with a plain “import” in TypeScript, and to Flow’s exact object notation ({||} vs {}). After a couple of quick RegExes, we were down to 414 syntax errors.
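A minimal sketch of what such regex-based rewrites might look like. The helper name flowToTsSyntax and both patterns are assumptions made for illustration; they are not the expressions used in the actual port, and a blunt pattern like this can also touch string literals or comments, so the resulting diff still deserves a review.

```typescript
// Hypothetical helper: rewrites two Flow-only syntaxes into forms tsc can parse.
function flowToTsSyntax(source: string): string {
  return source
    // `import type { A } from "./a"` becomes `import { A } from "./a"`
    .replace(/\bimport\s+type\s+/g, "import ")
    // Exact object types: `{| a: number |}` becomes `{ a: number }`
    .replace(/\{\|/g, "{")
    .replace(/\|\}/g, "}");
}

const flowSource = [
  'import type { Connection } from "./connection";',
  "type Props = {| id: number, name: string |};",
].join("\n");

const converted = flowToTsSyntax(flowSource);
console.log(converted);
```

In practice one would run a transformation like this over every source file and then let tsc report whatever syntax errors remain.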

These remaining errors had to be manually fixed:

  1. The existential type that we use for partial generic type argument inference had to be replaced with either explicitly naming the various type arguments or using the unknown type to tell TypeScript that we don’t care about some of the type arguments.
  2. The $Keys type and other Flow advanced types have a different syntax in TypeScript (e.g. $Shape<> corresponds to Partial<> in TypeScript).
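To make the two items above concrete, here is a small sketch of how a few Flow utility types map onto TypeScript. ClusterNode, Box, describeBox, and applyUpdate are hypothetical names invented for this example; the $ReadOnly to Readonly pairing is included as a further common case that the list above does not mention.

```typescript
// Hypothetical type used only for illustration.
type ClusterNode = { host: string; port: number };

// Flow: $Keys<ClusterNode>      TypeScript: keyof ClusterNode
type NodeField = keyof ClusterNode; // "host" | "port"

// Flow: $Shape<ClusterNode>     TypeScript: Partial<ClusterNode>
type NodeUpdate = Partial<ClusterNode>;

// Flow: $ReadOnly<ClusterNode>  TypeScript: Readonly<ClusterNode>
type FrozenNode = Readonly<ClusterNode>;

// Flow's existential type `*` has no direct counterpart; `unknown` marks a
// type argument that the caller does not care about.
type Box<T> = { value: T };
function describeBox(box: Box<unknown>): string {
  return typeof box.value;
}

// Merges a partial update into a full node.
function applyUpdate(node: ClusterNode, update: NodeUpdate): ClusterNode {
  return { ...node, ...update };
}

const updated = applyUpdate({ host: "127.0.0.1", port: 3306 }, { port: 3307 });
const field: NodeField = "port";
console.log(updated[field], describeBox({ value: 42 }));
```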

After all the syntax errors were fixed, tsc (the TypeScript compiler) finally told us how many real type errors our codebase had: just around 1300. This is when we had to sit down and decide whether it made sense to keep going or not. After all, if it would take us weeks of development time, it might not be worth it to go forward with the port. However, we figured it should take us less than 1 week of a single engineer’s time to perform the port, so we charged ahead.

Note that during the transition, we had to stop other work on this codebase. However, it should be possible to contribute new work in parallel to such a port — but you’ll have to work on top of potentially hundreds of type errors, which will not be an easy feat.

What Were All These Type Errors?

TypeScript and Flow make different assumptions about many different things, which in practice means that they let your JavaScript code do different things. Flow is more strict about some things, and TypeScript is more strict about other things. A full in-depth comparison between the two type checkers would be really long, so in this blog post we’ll just study a few examples.

Note: all the TypeScript playground links in this article assume that all the “strict” settings have been turned on. However, unfortunately, when you share a TypeScript playground link, those settings are not saved in the URL. For this reason, you have to manually set them when you open any TypeScript playground link from this article.

invariant.js

A very common function in our source code is the invariant function. I can’t explain it any better than the documentation does, so I’ll just quote it here:


var invariant = require('invariant');

invariant(someTruthyVal, 'This will not throw');
// No errors

invariant(someFalseyVal, 'This will throw an error with this message');
// Error raised: Invariant Violation: This will throw an error with this message

The idea is very simple — a simple function that will potentially throw an error based on some condition. Let’s see how we could implement it and use it with Flow:


type Maybe<T> = T | void;

function invariant(condition: boolean, message: string) {
  if (!condition) {
    throw new Error(message);
  }
}

function f(x: Maybe<number>, c: number) {
  if (c > 0) {
    invariant(x !== undefined, "When c is positive, x should never be undefined");

    (x + 1); // works because x has been refined to "number"
  }
}

Now let’s run the exact same snippet through TypeScript. As you can see in the link, we get an error from TypeScript, since it can’t figure out that “x” is actually guaranteed to not be undefined on the last line. This is actually a known issue with TypeScript — it can’t perform this type of inference through a function (yet). However, since it’s a very common pattern in our code base, we had to replace every instance of invariant (over 150 of them) with more manual code that just throws an error in-place:


type Maybe<T> = T | void;

function f(x: Maybe<number>, c: number) {
  if (c > 0) {
    if (x === undefined) {
      throw new Error("When c is positive, x should never be undefined");
    }

    (x + 1); // works because x has been refined to "number"
  }
}

This is not as nice as invariant, but it’s not a huge deal either.
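For readers on newer compilers: TypeScript 3.7 and later support assertion signatures, which allow a helper like invariant to refine types at its call site. The following is a sketch under that version assumption, not a description of what the port did at the time.

```typescript
type Maybe<T> = T | void;

// `asserts condition` tells the compiler that `condition` is true
// whenever this function returns normally.
function invariant(condition: boolean, message: string): asserts condition {
  if (!condition) {
    throw new Error(message);
  }
}

function f(x: Maybe<number>, c: number): number {
  if (c > 0) {
    invariant(x !== undefined, "When c is positive, x should never be undefined");
    return x + 1; // x has been refined to "number" by the assertion above
  }
  return 0;
}

console.log(f(1, 1));
```

Calling f(undefined, 1) throws at runtime with the invariant message, exactly like the in-place version.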

$ExpectError vs @ts-ignore

Flow has a very interesting feature that is similar to @ts-ignore except that it will error if the next line is not an error. This is very useful for writing “type tests” which are tests that ensure that our type checker (be it TypeScript or Flow) is finding certain type errors that we want it to find.

Unfortunately, TypeScript does not have this feature, which means that our type tests lost some value. It’s something that I’m looking forward to TypeScript implementing.
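TypeScript 3.9 and later do ship a comparable directive, @ts-expect-error, which reports an error of its own when the following line type checks cleanly. A small sketch of a “type test” written that way, assuming such a compiler version (the double function is a made-up example):

```typescript
function double(n: number): number {
  return n * 2;
}

// @ts-expect-error: a string argument must be rejected by the type checker
const fromString = double("21");

const fromNumber = double(21);
console.log(fromString, fromNumber);
```

If the signature of double were ever loosened to accept strings, the directive would become unused and the type checker would flag it, which is the property that makes type tests useful.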

General Type Errors and Type Inference

Oftentimes, TypeScript requires you to be more explicit than Flow, as in this example:

type Leaf = {
  host: string;
  port: number;
  type: "LEAF";
};

type Aggregator = {
  host: string;
  port: number;
  type: "AGGREGATOR";
}

type MemsqlNode = Leaf | Aggregator;

function f(leaves: Array<Leaf>, aggregators: Array<Aggregator>): Array<MemsqlNode> {
  // The next line errors because you cannot concat aggregators to leaves.
  return leaves.concat(aggregators);
}

Flow infers the type of leaves.concat(aggregators) to be Array<Leaf | Aggregator>, which can then be cast to Array<MemsqlNode>. I think this is a good example of how sometimes Flow can be a little smarter, whereas TypeScript sometimes needs a little bit of help. (We can use a type assertion to help TypeScript in this case, but using type assertions is dangerous and should be done very carefully).
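One way to get the same result without a type assertion is to spread both arrays into a new array literal, so that the element type is inferred as the union. This is a sketch of an alternative, not necessarily the change made in the actual codebase; allNodes and the sample hosts are made up for the example.

```typescript
type Leaf = { host: string; port: number; type: "LEAF" };
type Aggregator = { host: string; port: number; type: "AGGREGATOR" };
type MemsqlNode = Leaf | Aggregator;

function allNodes(leaves: Array<Leaf>, aggregators: Array<Aggregator>): Array<MemsqlNode> {
  // Spreading into a new array literal lets TypeScript infer the union element type.
  return [...leaves, ...aggregators];
}

const nodes = allNodes(
  [{ host: "10.0.0.1", port: 3306, type: "LEAF" }],
  [{ host: "10.0.0.2", port: 3306, type: "AGGREGATOR" }],
);
console.log(nodes.map(node => node.type));
```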

Even though I have no formal proof that allows me to state this, I consider Flow to be quite superior to TypeScript around type inference. I’m very hopeful that TypeScript will get to Flow’s level, seeing as it is very actively developed, and as many recent improvements to TypeScript have been in this exact area.

Throughout many parts of our source code, we had to give TypeScript a bit of help via annotations or type assertions (though we avoided type assertions as much as possible). Let’s look at one more example; we had perhaps over 200 instances of this type of error:


type Player = {
    name: string;
    age: number;
    position: "STRIKER" | "GOALKEEPER",
};

type F = () => Promise<Array<Player>>;

const f1: F = () => {
    return Promise.all([
        {
            name: "David Gomes",
            age: 23,
            position: "GOALKEEPER",
        }, {
            name: "Cristiano Ronaldo",
            age: 33,
            position: "STRIKER",
        }
    ]);
};

TypeScript will not let you write this because it can’t let you cast { name: "David Gomes", age: 23, position: "GOALKEEPER" } as an object of type Player (open the Playground link to see the exact error). This is another instance where I consider TypeScript to not be “smart enough” (at least when compared to Flow, which understands this code).

In order to make this work, you have a few options:

  • Assert “STRIKER” as “STRIKER” so that TypeScript understands that the string is a valid enum of type "STRIKER" | "GOALKEEPER".
  • Assert both objects as Player.
  • Or what I consider to be the best solution, just help TypeScript without using any type assertions by writing Promise.all<Player>(...).
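The last option can be sketched as follows. The alias is renamed to LoadPlayers purely for readability, and the exact overloads of Promise.all differ between TypeScript versions, so treat this as an illustration rather than a guarantee for every compiler release.

```typescript
type Player = {
  name: string;
  age: number;
  position: "STRIKER" | "GOALKEEPER";
};

type LoadPlayers = () => Promise<Array<Player>>;

// The explicit type argument gives the object literals a contextual type, so
// "GOALKEEPER" and "STRIKER" keep their literal types instead of widening to string.
const loadPlayers: LoadPlayers = () => {
  return Promise.all<Player>([
    { name: "David Gomes", age: 23, position: "GOALKEEPER" },
    { name: "Cristiano Ronaldo", age: 33, position: "STRIKER" },
  ]);
};

loadPlayers().then(players => {
  console.log(players.map(player => player.position));
});
```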

Another example is the following (TypeScript), where Flow once again comes out as having better type inference:


type Connection = { id: number };

declare function getConnection(): Connection;

function resolveConnection() {
  return new Promise(resolve => {
    return resolve(getConnection());
  })
}

resolveConnection().then(conn => {
  // TypeScript errors in the next line because it does not understand
  // that conn is of type Connection. We have to manually annotate
  // resolveConnection as returning Promise<Connection>.
  (conn.id);
});
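With the annotation that the comment asks for, the example type checks. In this sketch, getConnection is given a stand-in body so that the snippet can run on its own; the returned id is made up.

```typescript
type Connection = { id: number };

// Stand-in for the `declare function getConnection()` in the example above.
function getConnection(): Connection {
  return { id: 1 };
}

// Annotating the return type (and the Promise constructor) tells TypeScript
// what the promise resolves to.
function resolveConnection(): Promise<Connection> {
  return new Promise<Connection>(resolve => {
    resolve(getConnection());
  });
}

resolveConnection().then(conn => {
  console.log(conn.id); // conn is known to be a Connection here
});
```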

A very small but nevertheless interesting example is that Flow types Array<T>.pop() as T, whereas TypeScript considers that it is T | undefined. This is a point in favor of TypeScript, because it forces you to double check that the item exists (if the array is empty, Array.prototype.pop returns undefined). There are some other small examples like this one where TypeScript outshines Flow.

TypeScript Definitions for Third-Party Dependencies

Of course, when writing any JavaScript application, the chances are you’ll have at least a handful of dependencies. These need to be typed, otherwise you’re losing out on much of the power of static type analysis (as explained in the beginning of this article).

Libraries that you import from npm can ship with Flow type definitions, TypeScript type definitions, both of these, or neither. It’s very common that (smaller) libraries don’t ship with either, meaning that you have to either write your own type definitions for them or grab some from the community. Both the Flow and the TypeScript community have a standard repository of third-party type definitions for JavaScript packages: flow-typed and DefinitelyTyped.

I have to say that we had a much better time with DefinitelyTyped. With flow-typed, we had to use its CLI tool to bring in type definitions for various dependencies into our project. DefinitelyTyped has figured out a way to merge this functionality with npm’s CLI tool by shipping @types/package-name packages in npm’s package repository. This is amazing, and made it much easier to bring in type definitions for our dependencies (jest, react, lodash, and react-redux, just to name a few).

On top of this, I also had a great time contributing to DefinitelyTyped. (But don’t expect the type definitions to be equivalent when porting code from Flow to TypeScript.) I’ve already sent a few pull requests, and all of them were a breeze. Just clone, edit the type definitions, add tests, and send a pull request.

The DefinitelyTyped GitHub bot will tag people who have contributed to the type definitions you edited for reviews. If none of them provide a review in 7 days, a DefinitelyTyped maintainer will review the PR. After getting merged to master, a new version of the dependency’s package is shipped to npm.

For instance, when I first updated the @types/redux-form package, the version 7.4.14 was automatically pushed to npm after it got merged to master. This makes it super easy for us to just update our package.json file to get the new type definitions.

If you can’t wait for the PR to be accepted, you can always override the type definitions that are being used in your project, as I explained in a recent blog post.

Overall, the quality of type definitions in DefinitelyTyped is much better, due to the larger and more thriving community behind TypeScript. In fact, our type coverage increased from 88% to 96% after porting our project from Flow to TypeScript, mostly due to better third-party dependency type definitions that have fewer any types in them.

Linting and Tests

  1. We moved from eslint to tslint (we found it more complicated to get started with eslint for TypeScript, so we just went with tslint).
  2. We are using ts-jest for running our tests that are using TypeScript. Some of our tests are typed, whereas others are untyped. (When it’s too much work to type tests, we save them as .js files.)

What Happened after we Fixed all of our Type Errors?

After one engineer week of work we got down to the very last type error, which we postponed for the short term with @ts-ignore.

After addressing some code review comments and fixing a couple of bugs – unfortunately, we had to change a very small amount of runtime code to fix some logic that TypeScript could not understand – the PR landed, and we have been using TypeScript since then. (And yes, we fixed the final @ts-ignore in a followup PR).

Apart from the editor integration, working with TypeScript has been very similar to working with Flow. Flow’s server is slightly faster, but this doesn’t turn out to be a huge problem, since both are equally fast at giving you inline errors for the file you’re currently looking at.

The only performance difference is that TypeScript takes a little bit longer (~0.5 to 1 second) to tell you whether there are any new errors in your project after you save a file. The server startup time is about the same (~2 minutes), but that doesn’t matter as much. So far, we haven’t had any issues with memory consumption, and tsc seems to consistently use around 600 megabytes of RAM.

It may seem that Flow’s type inference makes it much better than TypeScript, but there are two reasons why that isn’t a big deal:

  1. We converted a codebase that was adapted to Flow to TypeScript. This means that we obviously only found things that Flow can express but TypeScript can’t. If the port had been the other way around, I’m sure we would have found things that TypeScript can infer/express better than Flow.
  2. Type inference is important and it helps keep our code less verbose. However, at the end of the day, things like a strong community and availability of type definitions are more important, because weaker type inference can be solved by “handholding” the type checker a bit more.

Code Statistics


$ npm run type-coverage # https://github.com/plantain-00/type-coverage
43330 / 45047 96.19%

$ cloc # ignoring tests and dependencies
--------------------------------------------------------------------------------
Language                      files         blank       comment           code
--------------------------------------------------------------------------------
TypeScript                      330         5179        1405          31463

What’s Next?

We’re not done with improving the static type analysis in our code. We have other projects at MemSQL that will eventually drop Flow in favor of TypeScript (and some JavaScript projects that may start using TypeScript), and we want to make our TypeScript configuration stricter. We currently have “strictNullChecks” turned on, but “noImplicitAny” is still disabled. We’re also going to remove a couple of dangerous type assertions in our code.

I am excited to share other things I learned in my adventures with JavaScript type systems in future blog posts. If there is a topic you would like to see me cover, please let me know!

Footnotes

  1. Combining static typing with runtime type checking sounds like it could be interesting for certain use cases, and io-ts allows for this with tcomb and TypeScript, but I have never tried it.
  2. If you’re using tsc by itself with Babel, you can actually configure it to achieve this same behavior.
  3. As of now, the DefinitelyTyped repository has 19682 GitHub stars, compared to 3070 in the flow-typed repository.