Voice Memos

The past few weeks I've been a busy chap. I've been building a new web app for recording voice memos. I tried to do as much right as I could, using all the modern technologies. As with all projects, not everything went swimmingly...

  • Last updated: 15 May 2015
  • Est. Read Time: 13 mins
  • Tagged: #app, #voice, #web

See Voice Memos

Overview and Highlights

The app is simple enough in its concept: allow users to record voice memos, persist them on the device, and allow for offline access. The idea is, in many ways, to show that there is a whole class of applications that the web can deliver perfectly well, and across platforms at that!

Here’s what I baked in:

  • Service Worker and IndexedDB for offline. No good having an app that can work offline but doesn’t. Just as well we have amazing new tech to make that a reality, isn’t it?
  • ES6 classes, fat arrow functions, and Promises. I went all-out crazy to try and use tech that I felt were on the leading edge of what we want to use as developers. I used Babelify (more on that in a bit) to transpile to ES5.
  • High performance. I really tried to make loading and render performance super high priorities in the build. Not always the easiest, but it’s certainly possible given some care. I don’t have access to all the hardware and software configurations out there, but on the devices I could test there’s not much more I could have optimized at this stage.
  • Material Design and Responsive. It works from mobile through to desktop, with three breakpoints for small, medium, and large screens. As usual I set a breakpoint when I felt the content looked broken. (Partly why I feel it’s healthy for developers to understand design as much as for designers to understand development.)
  • Open source. Of course it is! \o/ Go and get the code and have a look around!
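On the offline point, the core of a Service Worker setup like this usually boils down to a cache-first lookup with a network fallback. Here’s a minimal sketch of that strategy, not necessarily the app’s exact code: the `cacheFirst` name and the injected `cacheMatch`/`fetchFn` arguments are mine, so the logic is easy to follow in isolation. A real worker would use `caches.match` and `fetch` directly.

```javascript
// Cache-first strategy: serve from the cache, fall back to the network.
// In a real Service Worker this would be wired up roughly as:
//   self.addEventListener('fetch', function (evt) {
//     evt.respondWith(cacheFirst(evt.request,
//         function (r) { return caches.match(r); }, fetch));
//   });
function cacheFirst (request, cacheMatch, fetchFn) {
  return cacheMatch(request).then(function (cached) {
    // A cache miss resolves with undefined, so fall through to the network.
    return cached || fetchFn(request);
  });
}
```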

Browser Support

Let me say right from the start that Mobile Safari isn’t supported… yet. Not because I don’t like it as a browser. Far from it, in fact. My two day-to-day devices are a Nexus 5 and an iPhone 5S, so I’m quite down with Safari as a consumer. But, and this is the key issue, iOS doesn’t support getUserMedia, which I need to be able to pipe microphone audio to the Web Audio API. Without it there’s no realistic means to live-capture audio. There are weird hacks one can use to record a video and then attempt to rip out the audio, but … well, no. No.

In any case iOS already has a voice memos app, so, you know, there’s that.


Since we’re talking broken, let’s also talk about codecs. My biggest frustration was the state of media today. What I wanted to do was to create encoded files to store in IndexedDB. By default the buffers you get from the Web Audio API are uncompressed, a bit like getting a bitmap image. So I looked into what I could do about it. Turns out there are some excellent people who have Emscriptened things like the LAME MP3 encoder, and an Ogg Opus one (I’d never heard of the Opus format before this!) to boot.

Logo: opus-codec.org

The LAME MP3 encoder was 296KB, which I just couldn’t justify downloading on every visit. And, for that matter, MP3 encoders seem to be mired in patent issues, which I don’t fully understand and therefore want to avoid. Meanwhile, the Ogg Opus one… Well, that was just over 500KB, but, hey, better than storing WAVs on people’s devices, or getting into patent problems, I guess.

It turns out that Chrome supports Ogg Opus on desktop through the <audio> tag, or new Audio(), but not through the Web Audio API’s decodeAudioData function. (I totally filed a bug on that.) On mobile Chrome only supports Opus when it’s in a Matroska container… Honestly, I have never heard of .mkv files, either.
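If you need to feature-check this yourself, canPlayType on an audio element is the least-bad option, bearing in mind it only tells you about <audio> playback and says nothing about decodeAudioData. A hedged sketch (the function name is mine, and the element is injected so the check is easy to exercise outside a browser):

```javascript
// Rough feature-check for Ogg Opus playback support. canPlayType
// returns '', 'maybe', or 'probably', so anything non-empty counts
// as hopeful. In the browser you'd call: canPlayOpus(new Audio()).
function canPlayOpus (audioEl) {
  if (!audioEl || typeof audioEl.canPlayType !== 'function') {
    return false;
  }
  return audioEl.canPlayType('audio/ogg; codecs="opus"') !== '';
}
```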

In any case, there’s hope in the form of the MediaStreamRecorder API, though I don’t know if it actually encodes anything in that process. Certainly the spec just says it returns a blob, so goodness knows. Could be a WAV blob, for all I know.

Long story, short: the app has to store audio as WAVs, so don’t record long memos! I’m sorry about that, but there’s nothing I can do about this just yet. If the vendors come together and support a common codec (or group of them) everywhere, I’ll be right at the front cheering them on.
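For the curious, writing out a WAV is mostly a matter of prefixing the raw samples with a 44-byte RIFF header. Here’s my own minimal sketch of the kind of thing an encoder like Recorderjs does internally, not the app’s actual code (mono, 16-bit PCM assumed):

```javascript
// Minimal 16-bit PCM WAV encoder (mono). samples is a Float32Array
// of values in the -1..1 range; returns an ArrayBuffer holding the
// complete file: 44-byte RIFF header followed by the sample data.
function encodeWAV (samples, sampleRate) {
  var buffer = new ArrayBuffer(44 + samples.length * 2);
  var view = new DataView(buffer);

  function writeString (offset, str) {
    for (var c = 0; c < str.length; c++) {
      view.setUint8(offset + c, str.charCodeAt(c));
    }
  }

  writeString(0, 'RIFF');
  view.setUint32(4, 36 + samples.length * 2, true);  // chunk size
  writeString(8, 'WAVE');
  writeString(12, 'fmt ');
  view.setUint32(16, 16, true);                      // fmt chunk size
  view.setUint16(20, 1, true);                       // PCM format
  view.setUint16(22, 1, true);                       // mono
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * 2, true);          // byte rate
  view.setUint16(32, 2, true);                       // block align
  view.setUint16(34, 16, true);                      // bits per sample
  writeString(36, 'data');
  view.setUint32(40, samples.length * 2, true);

  // Clamp each float sample and scale it to a signed 16-bit integer.
  for (var i = 0; i < samples.length; i++) {
    var s = Math.max(-1, Math.min(1, samples[i]));
    view.setInt16(44 + i * 2, s < 0 ? s * 0x8000 : s * 0x7FFF, true);
  }

  return buffer;
}
```

In the browser you’d wrap the result in a Blob (`new Blob([buffer], { type: 'audio/wav' })`) before stashing it in IndexedDB. It also makes plain why WAV hurts: every second of mono 44.1kHz audio costs about 86KB.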

Making waves

Making waaaaves!

When you record a memo, you may notice that there is an audio wave pattern in the background. It is actually generated from the audio itself. Initially I had thought I’d do this by decoding the audio data, but it became clear that I wasn’t going to be using the Web Audio API beyond the recording, simply because there was going to be no audio to decode, and so an <audio> element would do just fine.

What I ended up doing was adding an analyser node to Recorderjs’s input stream, and taking samples throughout the recording. These were normalized and stored alongside the audio data. When a memo is shown, the wave gets rendered using a canvas element.

var listener = recorder.audioContext.createAnalyser();

// Add the listener to the recorder's input stream.
recorder.node.connect(listener);

// And watch for as long as we are recording.
var samples = new Uint8Array(listener.frequencyBinCount);
listener.getByteTimeDomainData(samples);

Doing it this way also meant I could give people a live readout as they were recording.

People get live feedback as they record memos.

Overall it’s a little touch, but I kind of felt it was more fun than just a solid block of purple. And for me it showed just how powerful these APIs can be.
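To make the normalize-and-render step concrete, here’s roughly its shape. The names and specifics are mine, for illustration, and the app’s actual code will differ:

```javascript
// Reduce one frame of time-domain analyser data (0..255 values, silence
// sits at 128) to a single normalized amplitude in the 0..1 range. The
// idea is to store an array of these alongside the audio data.
function normalizeFrame (frameData) {
  var max = 0;
  for (var i = 0; i < frameData.length; i++) {
    // Measure the largest deviation from the silent midpoint.
    max = Math.max(max, Math.abs(frameData[i] - 128));
  }
  return max / 128;
}

// Later, render the stored values as vertical bars (browser only):
function drawWave (ctx, values, width, height) {
  var barWidth = width / values.length;
  values.forEach(function (value, index) {
    ctx.fillRect(index * barWidth, height * (1 - value),
        barWidth, height * value);
  });
}
```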

Standing on the shoulders of giants

Goodness me there were lots of amazing things out there for me to use, and I think it would be wrong of me to pretend like I did this on my own, somehow. Here are some of the things I used (in no particular order):

  • Gulp plugins galore. I ended up making a couple of versions of the gulp file, because it was the first time I used it and I screwed up a bunch of things. But man has the Gulp community been busy. Everything from bumping version files to uglifying JS, to licensing. All covered!
  • ES6 transpilers. There are a few of these on the go, but I opted for Babel(ify), since it seemed to have solid gulp support through watchify. I was particularly impressed with the source maps support, because while it did add it as a data-uri to the end of the files (which is fine for dev time anyway), it never got it wrong and it made debugging a breeze!
  • Material Design guidelines. Sure it was going to be the default look and feel for someone who works at Google, but as a system it does work. You could definitely use a different baseline grid (it seems to be 8dp multiples for Material Design), and different colours, but its value for me is in the UX. It wants you to focus on directing the user to the task. And for me, in following its guidance, it was simple enough to make something effective.
  • Recorderjs and Moment.js. Both of these are life-savers, particularly Moment. Nobody wants to wrangle lower level APIs or fiddly things like dates and times, and both of these micro-libraries helped me avoid that. I actually used a fork of Recorderjs, just because it had the Opus codec.

This, for me, is a large part of the web. When it works, and there are solid tools and clear guidance, everyone wins. I know I certainly did, and I’m thankful to people in the community who’ve put so much time and effort into building things others can use.

ES6 funsies

So I had this idea: what about a model abstraction layer that could persist to IndexedDB under the hood. Sounds good, right? I thought so, too, and I had in mind that I could use ES6 classes for it. So I’d have a Model class with a bunch of static methods for CRUD (Create, Retrieve, Update and Delete) operations, and from which other models could inherit:

class Model {

  static get (id) {
    // Get data from IDB.
    return data;
  }

  // ... and the same for put and delete.
}

class MemoModel extends Model {
  // Epic stuff goes here.
}

// Get a Memo from the DB.
MemoModel.get(1);
But there’s this problem, you see: IndexedDB strips the prototype (which is what ES6 classes really are) when it stores an object. On retrieval you don’t get a MemoModel back, you get a plain Object, which isn’t so splendid.

I noodled around, and talked this through with the team around me a fair amount. I was effectively looking for a factory pattern: Model needed to know which class was calling – say – get, and then it could wrap the object in the correct instance and away we go.

Here’s the solution for my conundrum:

class Model {
  static get (id) {
    // Get data from IDB...
    return new this(data);
  }
}

Perhaps it looks weird to you. It certainly did to me at first, but what it does is quite neat: inside a static method, this refers to the class the method was called on, so calling get through a class that inherits from Model creates an instance of the inheriting class, not of Model.

I’d made sure that objects were the means by which models were populated, so that I could just pass an object to their constructors. That way I got back a fully-fledged MemoModel rather than just an Object, without having to hard code getters and putters on every model. Hurray!
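Here’s a runnable, self-contained sketch of the trick, with a fake in-memory store standing in for IndexedDB. The store, the synchronous get, and the field names are all mine, for illustration; the real thing is asynchronous and talks to IDB:

```javascript
// A fake, synchronous store standing in for IndexedDB.
var fakeStore = { 1: { title: 'Groceries' } };

class Model {
  constructor (data) {
    // Populate the instance from a plain object.
    Object.assign(this, data);
  }

  static get (id) {
    // `new this` resolves to whichever class get() was called on,
    // so MemoModel.get(...) hands back a MemoModel.
    return new this(fakeStore[id]);
  }
}

class MemoModel extends Model {}

var memo = MemoModel.get(1);
// memo is a MemoModel, not a plain Object — the factory worked.
```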

Other things that are cool: fat arrow functions. They generally let me write anonymous callbacks without having to worry about naming and binding, for which I was very grateful!

class Foo {
  constructor () {
    this.bar = 'baz';

    requestAnimationFrame(() => {
      // This works!
      console.log(this.bar);
    });
  }
}
Normally that above example wouldn’t work, because callbacks for requestAnimationFrame, setTimeout, setInterval (and probably others) get this set to the global object rather than your instance, which never fails to catch me out. Well no more, you hear! NO MORE.
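For a concrete, testable contrast, here’s the same idea with a synchronous callback (this Counter example is mine, not from the app):

```javascript
class Counter {
  constructor () {
    this.total = 0;
  }

  addAll (values) {
    // With a plain function callback, `this` inside it would not be
    // the Counter (class bodies are strict mode, so it'd be undefined
    // and this line would throw). The fat arrow keeps the lexical
    // `this` from addAll, so the accumulation just works.
    values.forEach((value) => {
      this.total += value;
    });
    return this.total;
  }
}
```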

The challenges of the modern web

On balance I think I came down to four or so particular frustrations during this build, though given long enough to think things through there may be more:

  1. The codecs issue. It hurts developers, and therefore users, that we don’t have a good story here.
  2. The lack of API ubiquity. I hated writing the code for Mobile Safari (and IE for that matter) that basically said “you’re a no-go here”, because I desperately wanted to be able to say it works everywhere. But it’s not functionality I can realistically polyfill. Maybe I should’ve picked a different app. Mind you, I think where those APIs do exist the app itself is pretty neat.
  3. View setup and teardown. I’ve said it before, and I guess I’ll say it again: we have a platform that is amazing for documents. However, when you build something that has views, like an app, you spend a lot of code writing boilerplate to manage those views. And, of course, with large numbers of DOM elements you have potential performance issues from style calculations and layout. By contrast, Android and iOS’s base state is that the platform manages view lifecycles for you. That said, they’re no good at documents, are they? Well, if we can crack app structures on the web I reckon we’ll be flying.
  4. Performance. You didn’t think you’d get away without me mentioning it a little bit, did you? Actually, in the main, I think things are really looking up. I was careful to follow Jake’s advice of progressively loading the app, and from there I was my usual self about avoiding render perf issues. But even so I just felt like, again, I spent a lot of time wrangling for performance without platform-level primitives to help me. I’d have loved component loaders that understood that there is a bundle of HTML, JavaScript and CSS that belongs in a particular place in the app, and that knew when to load them, and how to hold other parts of the app off if they needed access to those components. I know HTML Imports is in that rough camp, and ES6 also has module loaders, but I’d love to see something more “component aware” at the platform level.


Wow, this ended up being a much longer post than I anticipated. If you’ve made it this far, I hope you’ve enjoyed the read.

I’ve tried to build a high-quality app with genuine usefulness, and I hope that comes over. I’ve given every aspect a lot of thought, and not everything worked out as planned. A lot of it did, though!

If you can, give the app a try, and let me know what you think. Have a look at the source code, too!