Month: August 2019

The challenge of C/C++ firmware libraries

One of the statements about MetalScript that people seemed to disagree with is the idea that it can take as long as a week to integrate a C/C++ library into a firmware project.

TL;DR I’ve spent many years professionally doing both JavaScript and C/C++ firmware. In my experience, integrating and becoming familiar with a C/C++ firmware library can often take days, or in some cases weeks, while a JavaScript library often takes minutes to use.

Let me first say that if you are an expert in C and/or C++ firmware and have chosen not to use JavaScript for any major projects, then you may not be the person who will want to use MetalScript. You have spent years mastering a complicated craft, and although you have probably seen or worked with people who use JavaScript, you’ve chosen to stick with C/C++ because you probably believe it’s better. You may even look down on people who program in JavaScript — “real men” know how to do their own memory management, avoid signed integer overflow, and use CRTP to write code that is both well-structured and performant at the same time.

If that describes you, then keep doing what you’re doing. MetalScript is not for you; it’s for people who love JavaScript and who want to write real firmware with it.

Let me also say this:

I am a C/C++ firmware developer, and have been programming firmware for the last 20 or so years.

Until about 5 years ago, my impression was that JavaScript was a pretty poor choice of language for various reasons. Then I actually learned it and used it in real projects, and I’m now converted to its merits [1]. I am qualified to compare C/C++ against JavaScript because I am proficient in both and have used both in many real-world projects.

The best thing about JavaScript is not the language itself. The language is good — it used to be pretty bad, but with ES6 and modern features, it is becoming a really good language to work in. But the thing that makes the JavaScript experience great is npm.

The npm package repository contains hundreds of thousands of packages for JavaScript. In addition to the packages themselves, there is a culture that drives useful conventions, such as:

  • Packages generally have their source code in GitHub
  • Packages generally have a readme file in the root written in markdown. And because everyone does it this way, npm and GitHub both display the readme on the main page for the package
  • The readme typically contains a brief description of what the package does, as well as how to install it (even though the installation process is almost always the same)
  • The readme often contains a set of examples to get you started
  • The readme often contains a set of options for advanced usage, or links to proper API documentation.

Not all packages will be this well presented, but the vast majority are, and it’s hard to overstate the importance of this conformity. It means that finding and getting started with a completely new library can happen within just a few minutes. The relevant information is all upfront, and the examples are typically self-contained so they can be pasted right into your code and they “just work”.

To try to demonstrate my point, I’m going to compare some examples. It’s difficult to come up with fair examples because firmware libraries are not going to be typically found in npm. So I will do this with two different examples — one of them using actual libraries and one of them a made-up scenario.

Example 1: Calculating a CRC

For the first example, I’m going to try to calculate a CRC in both C++ and JavaScript. This is something that is well suited to a third-party library, and so I expect to find code that already does it in both JavaScript and C++.

This comparison will probably be the best possible case for a C/C++ library. A CRC can be calculated without any platform dependence or customization. It should be as simple as finding a function online that does it and pasting it into the code. Let’s see how we go.

JavaScript

I’ll start with JavaScript, and will then compare the experience in C/C++.

  • At 10:22 AM, I Google “npm crc calculation” (Note: I recorded these timestamps while doing it, but not while blogging, so as to minimize interference)
  • Look at first result – a package called crc on NPM — open the page
  • The first thing on the front page: the list of features for the library. Yes, this looks like what I want
  • The second thing on the front page: the command to install it. At 10:23 AM I run the command npm install crc in my project folder. I have gone from “thinking that a library might exist”, to successfully installing it, in about 1 minute.
  • The third thing on the front page: the example code to use it. At 10:23 AM (still), I create a test script with two lines of example code:
const crc = require('crc');
 
console.log(crc.crc32('hello').toString(16));

At 10:24 AM, I run the code with node test.js. It works.

But actually I didn’t want CRC-32, I wanted CCITT-16. I adjust the test code to crc.crc16ccitt('hello') and it still works.

Finished by 10:25 AM — from imagining some functionality to having it integrated in 3 minutes. This is not unusual IMO, once you are familiar with the workflow and know where to expect everything by common convention.

In C

At 10:37 AM, the first thing I’m going to do is Google “CRC calculation in C”. There is no standard repository that I can search, so I’m open to anything on the internet.

I look at the first page of links. None of them jump out at me as what I’m looking for.

I look at the first link. Scrolling through it, I can see some diagrams, some code, and lots of writing. Perhaps if I want to understand CRCs, this is not a bad place to be. But really I would prefer it if someone else understood CRCs, and I just leverage their expertise.

Should I just copy-paste one of the example pieces of code? I skim-read pieces of the document to try to get an idea of whether this is a bad idea or not. The fact that it says “bit by bit” as one of the headings makes me think that it’s leading the reader through the implementation and starting with a less-than-ideal implementation. Better not use that one.

What about the other code snippet they include in the article? It’s not clear what kind of CRC this is for. Should I read the article? Should I cut my losses and move to the next link? Should I copy-paste and hope this is the right one? Time is ticking, and this is a race.

10:41 AM. I cut my losses on this page and move to the next search result. It’s got code — that’s good. But again, it’s got a “simple example” and an “advanced example” — not examples of usage, but examples of CRC functions. What does simple and advanced mean? Does one do more stuff than the other? Is one more efficient than the other? Do I have to read the code to find out? Time is ticking.

Glancing through the code (now the 3rd and 4th pieces of somebody else’s code that I’ve had to look at), I see that the simple example doesn’t use a lookup table, and the advanced one does. Likely I’m on another educational page that’s trying to teach the reader about how to do CRCs.

Why is this kind of thing the first two search results? Surely people more commonly want to use CRC code than to write and understand their own implementation? Does it say something about the culture of C that the top links on Google for “CRC calculation in C” are to help people write their own implementation from scratch?

Maybe my search terms are poor. Maybe I should have used the term “ccitt16” in the search query. Maybe some other changes would also help? I remember that the next two search results are Stack Overflow questions. Let me have a quick look at them before I go back to try other search terms.

10:45 AM. Third search result. This is a Stack Overflow question. He says:

[Bla bla bla] I’ve created a function to calculate a CRC16 checksum, but it doesn’t seem to be outputting correct values, [bla bla bla]

(I’m skim reading because I’m in a race against the JavaScript guy who integrated a working library in a quarter of the time it’s taken me not to get anywhere)

The guy wants to fix his function. The top answer has a bunch of explanation that I don’t have time to read, and then some code that is prefaced with “so, your function might look like”. Those are not words that inspire confidence in me. It sounds like his goal is to help the questioner figure out where he went wrong, rather than writing production-quality code that many other people will depend on.

Should I look at the other answers? Should I abandon these search terms and try something else? Should I look at the other SO question?

Let me have a quick glance at the other SO question before deciding.

10:48 AM. Fourth search result. A Stack Overflow question. This guy says:

Since CRC is so widely used, I’m surprised by having a hard time finding CRC implementations in C. [bla bla bla]

Totally agree with ya bro.

The top answer provides a bunch of links. A lot more reading, and a lot more implementations to choose from. But now at least we’re getting somewhere.

I actually ended up picking the implementation in the second answer of the first SO question. Not because it was carefully considered as the best choice, but because I was in a rush and it had a couple of nice properties at a glance:

  • It was short, so I felt less intimidated
  • The answer was only prefaced with 2 sentences, so there wasn’t much reading for me to do
  • In one sentence I see the words crc16 CCITT
  • In the other sentence, I see the word “tested” and a link (the link makes it official!)

10:50 AM. I paste the code into a C file and write a main function to test it. 

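#include <stdio.h>
#include <string.h>

// crc16() is the function pasted from the Stack Overflow answer
int main(void) {
  const char* str = "hello";
  unsigned short crc = crc16(str, strlen(str)); // unsigned, so %04x prints 4 digits
  printf("%04x\n", crc);
  return 0;
}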
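
I haven’t reproduced the pasted function here. For readers who want something concrete to put behind that call, here is a minimal sketch of a function with the same call shape. It assumes the CRC-16/CCITT-FALSE variant (polynomial 0x1021, initial value 0xFFFF, no reflection), and it is the slow bit-by-bit form, written for illustration. It is not the code from the Stack Overflow answer, and I haven’t checked which CCITT variant the npm package implements.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the pasted function: CRC-16/CCITT-FALSE,
   computed bit by bit (polynomial 0x1021, initial value 0xFFFF). */
uint16_t crc16(const char *data, size_t length) {
  uint16_t crc = 0xFFFF;
  for (size_t i = 0; i < length; i++) {
    /* Fold the next input byte into the top 8 bits of the register */
    crc ^= (uint16_t)((uint8_t)data[i] << 8);
    for (int bit = 0; bit < 8; bit++) {
      if (crc & 0x8000) {
        crc = (uint16_t)((crc << 1) ^ 0x1021);
      } else {
        crc = (uint16_t)(crc << 1);
      }
    }
  }
  return crc;
}
```

The standard check input for this variant is the ASCII string "123456789", which should give 0x29B1. That is a quick way to tell whether a snippet found online implements the variant you think it does.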

10:53 AM. I try to compile it, but GCC is not in my environment path. This has nothing to do with the library, so let’s just pretend I compiled it and it worked.

It took me about 15 minutes, as opposed to JavaScript’s 3 minutes.

Postmortem

Speed to find library

Most of the time spent in C was spent finding the library. While it’s true that this will generally be slower in C than in JS (since there is no common convention and central catalog of such libraries), finding the library will never take a whole week, so this small example doesn’t account for the majority of the time that I claimed it takes to get a C firmware library integrated.

Why is that? What’s special about the CRC example that makes it unrepresentative of the norm?

I think the answer is that if you are picking an example from the subset of libraries that work both in today’s JavaScript world (i.e. it will be intended for server or browser) and also the world of C/C++ firmware, you are actually left with a small collection of libraries which do not exhibit most of the complexities that arise in firmware, which biases the comparison. As I mentioned earlier, a CRC-calculating function is much easier to make platform-agnostic, and so should be the best possible candidate for a hassle-free library in C.

Speed to install/integrate

I’d say that both the C and JS versions in this example were pretty similar to integrate into a test script (representative of a larger application). However, from experience, I’d say that almost every npm library is just as easy to integrate as this CRC library, typically only taking a few minutes to get going with the basic examples. In JS, I believe this example is representative of the general experience.

In C however, we’ve picked an example that is trivial to integrate — just copy and paste.

Most C/C++ firmware libraries are integrated at the source code level because of the wide range of possible target architectures, and the source code often requires extensive customization or dependency implementation or port layers in order to get it to work for your particular setup (and reading through documentation to understand how to do that). To compound the issue, the reality is that many firmware compilers don’t support the full C/C++ spec, normally for performance or architecture reasons. These are not criticisms of C/C++ per se, but nevertheless, are part of the typical experience of using C/C++ in a firmware environment.
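
To make “port layer” a bit more concrete, here is a rough sketch of the pattern as I usually encounter it: the library declares a handful of platform hooks, and it’s up to you to implement them for your board. Everything named here (modem_port_t, modem_send_command, the fake UART) is invented for illustration and doesn’t come from any real library.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical port layer: hooks the library expects the integrator to supply. */
typedef struct {
  int (*uart_write)(const char *data, size_t length); /* returns bytes written */
  void (*delay_ms)(unsigned int ms);
} modem_port_t;

/* Hypothetical library function: it only reaches hardware through the port. */
int modem_send_command(const modem_port_t *port, const char *command) {
  int written = port->uart_write(command, strlen(command));
  written += port->uart_write("\r\n", 2);
  port->delay_ms(10); /* arbitrary pause before the caller reads a reply */
  return written;
}

/* Integrator's side: a fake UART that captures output in a buffer so the
   sketch runs on a PC. On a real board this would drive the UART peripheral. */
static char fake_uart_buffer[64];
static size_t fake_uart_length = 0;

static int fake_uart_write(const char *data, size_t length) {
  if (fake_uart_length + length >= sizeof fake_uart_buffer) return 0;
  memcpy(fake_uart_buffer + fake_uart_length, data, length);
  fake_uart_length += length;
  fake_uart_buffer[fake_uart_length] = '\0';
  return (int)length;
}

static void fake_delay_ms(unsigned int ms) { (void)ms; /* no-op on a PC */ }
```

Using it means filling in a modem_port_t with your two functions and passing it to every library call. A real port layer typically has many more hooks than this (reads, timeouts, locking, interrupts), and that is where the days go.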

Confidence

How confident am I in the JS vs C library?

The C “library”, if we can call it that, is some code in a Stack Overflow answer that one guy wrote. His testing involved running it a few times and checking that the output matched some online web page. I don’t feel great about that. Maybe it’s okay, because if there were something wrong with it, other people reading it would probably have spotted the problem and left a comment.

On the other hand:

  • The npm crc package has been downloaded over a million times in the last 7 days. All those feet treading on the same path will harden that path. If there are any bugs, they will be found quickly.
  • It comes standard with a suite of unit tests, and both the GitHub and npm pages display that all the tests are passing.

Furthermore, the package manager allows me to quickly update my dependencies, to make sure that I get the latest bug fixes at any time.

Documentation and Ease of Use

I think this answers itself. There is no documentation and no example code with the C version — it is a code snippet in a SO answer, so what do you expect?

The JS version has exactly the documentation that you’d expect from most npm libraries — it is concise and describes the key things you need to know in order to use it. It doesn’t try to tell you the theory behind CRCs, or anything that isn’t directly relevant to being productive as quickly as possible.

Am I talking about the language or the package manager?

I’d like to just clarify something because I know this is going to be brought up. I say that I’m comparing C/C++ vs JavaScript but then go on talking about things that are not part of the language at all (Google, Stack Overflow, npm, culture and ecosystem). Is that valid?

Yes, I think this is valid. When you choose to develop in JavaScript or C/C++, you’re not just adopting a language. You’re adopting all the tools, community, and culture surrounding the language. Productivity is affected by all of these factors, and they come together as a whole. You can say all you want about how you think C++ is a better language if you think that, but at the end of the day, it’s about getting shit done, and the JavaScript “whole” is better for that than the C++ “whole”.

Example 2: A modem driver

Here, I’m picking an example that adds complexity more typical of firmware development, but the tradeoff is that this example isn’t real — it’s merely a vision I have for the future. I don’t think the current state of JavaScript firmware development is mature enough for this to be a reality today.

For this example, I will assume the following hypothetical scenario:

  • We have a product that has an MCU and a u-blox cellular modem
  • Due to a shortage of UARTs on the MCU, the product connects the modem to a UART extender
  • Objective: connect to the internet to send an HTTP POST, receive the response JSON, decode it, and output the message therein to the debug UART

Before I even start, if you are a firmware programmer, take a moment to think about how you would do this. If I contracted you to write firmware for a device that does this, how long would it take you to write?

The Vision

There are two domain-specific pieces of information that we absolutely need to specify somewhere in any firmware, no matter what language:

  1. Information describing the behavior we require, such as the fact that at startup we want to connect to the internet, POST a message, and print out the response
  2. Information describing the device configuration, such as the fact that we have a u-blox modem, and the fact that it is connected via the UART extender, etc.

We can summarize the required behavior with the following hypothetical JavaScript code:

// app.js
import * as request from 'request-promise-native'; // third-party library to perform HTTP requests

export async function run(device) {
  await device.modem.connectToInternet();
  const reply = await request({
    url: 'http://my-service.com/test-url',
    method: 'POST',
    json: 'please give me a message to display'
  });
  console.log(reply);
}

We can summarize the required device configuration with the following hypothetical JavaScript code:

// device.js
import { UBloxModem } from 'ublox';
import { Max14830 } from 'max-14830'; // UART extender driver

export const uartExtenderI2C = mcu.i2c('G7');
export const uartExtender = new Max14830(uartExtenderI2C);
export const modemUart = uartExtender.uart(3);
export const modem = new UBloxModem(modemUart, 'LISA-U200');
export const debugUart = mcu.uart(2);
export const debugConsole = new UartConsole(debugUart, { baud: 115200 });

Then we also need some glue code:

// main.js
import * as device from './device';
import { run } from './app';

// The console we want to use for output messages
global.console = device.debugConsole;

// The device to use for connecting to the internet
global.internet = device.modem.internet();

// Transition from compile time to runtime
mcu.start();

run(device);

Perhaps the reality won’t be so easy, and I’m oversimplifying it. But I can imagine getting a library like this off npm and being able to get working with it on a firmware device within a few hours.

How long would it take to do the same thing in embedded C? Days, weeks, months?

Conclusion

In between writing this article and publishing it, I ran into another real-world example. I needed a Modbus connection from my C firmware to my Electron JavaScript application, and I was implementing both sides. The JavaScript side was working within an hour, as one would expect. The C side took days of implementing hundreds of lines of porting layer, managing states, and banging my head against the wall.

The reality is that JavaScript is simply a much more productive tool to use, and a large part of that is because of how easy it is to reuse third-party code and to share your own so that others can reuse it.


  1. Although if you want to use it for a real project these days, please use TypeScript so you can get static type checking