Angersock

We should've stopped--but let's keep going and see what happens.

Language Features for Non-programmers

Summary

Schlub is a language designed to avoid, or make impossible, as many of the excesses of modern languages as possible while offering features that make business users more productive. It does this by:

  • Providing data types suitable for engineering, mathematics, and statistical work
  • Enforcing the constraints of dimensional analysis when doing computations
  • Requiring specification of precision for numerical operations in blocks of code
  • Hiding machine-dependent representations wherever feasible
  • Omitting “eval” or similar metaprogramming constructs
  • Forbidding implicit conversions of datatypes during assignment
  • Having immutable storage for everything, allowing referential transparency
  • Disallowing rebinding of variable names
  • Supporting pattern-matching and destructuring for arrays, associative arrays, and composite types
  • Encouraging functional constructs (map, reduce, recursion) over accumulators and for loops
  • Mandating tail-call optimization (TCO) wherever possible
  • Native set operations
  • Native support for business-relevant types like dates, times, durations, and time intervals
  • Native support for CSV, XML, and JSON
  • Native support for arbitrary precision math
  • Native support for matrix operations
  • Native support for HTTP

This is a work in progress!

Please direct all questions, comments, concerns, and hatemail to angersock at this domain.

2018-06-26 edit: Such a work in progress, I apparently left typos all over the damned place. >:(

Introduction

I had a delightful holiday for Christmas and NYE, albeit one mostly spent out-of-town. On the way back home, while using a “luxury coach” (basically what used to be considered first-class on airplanes before they started going to rubbish), I ran into a fellow traveller I recognized from a local university.

This person was working on a language for “non-programmers”, for folks like business users and data scientists—people that they claimed had no formal CS education but who still wanted to use computers to automate their jobs. We chatted for a bit and ended up disagreeing on some of the finer points of language design, but it did get me thinking.

One of my most deep-seated beliefs about our industry is that we spend far too much time reinventing wheels, and so actually pursuing the implementation of a language strikes me as a mistake. But thinking about the features in the language that would be useful is another matter entirely.

So, this post is going to cover some of what I consider to be the most useful features such a language could have and the reasoning that goes into the existence of said features.

Let’s call the language…I don’t know, “schlub”.

The Problemspace of schlub

Schlub is meant to target users that are motivated and technical but who do not actually care about writing programs properly as part of their workflow.

These are users that are smart/stubborn enough to wrestle with bugs but who are not fluent in a given language or who are deliberately not spending time making “nice” code. If a senior software engineer is a master engineer designing the Cheyenne Mountain Complex, a schlub user is a grunt digging a fighting position: they know what they want, they accept that doing it is going to suck, but they need it now.

Long-term maintenance is not important to a schlub user, thus we don’t worry about affordances for design-by-contract or libraries or software-in-the-large. Schlub users also really don’t care about CS or related topics: if something runs too slowly, their department head cuts a bigger check to AWS and they buy more or larger instances.

Schlub users live in the real world. They care about math, statistics, and engineering, and so the language has a first-class understanding of units of measure, of time, and of dimensional analysis.
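To make the dimensional-analysis point concrete, here is a minimal sketch in Python (schlub has no implementation, so Python stands in). The `Quantity` class, the `metres` and `seconds` helpers, and the choice of (length, mass, time) as base dimensions are all assumptions made up for illustration. The idea is only that every number carries its dimensions, that multiplying and dividing combine them, and that adding mismatched ones is refused.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class Quantity:
    """Hypothetical value-with-units: a magnitude plus exponents of (length, mass, time)."""
    value: Fraction
    dims: tuple

    def __add__(self, other):
        # Addition only makes sense between quantities of the same dimension.
        if self.dims != other.dims:
            raise TypeError(f"cannot add dimensions {self.dims} and {other.dims}")
        return Quantity(self.value + other.value, self.dims)

    def __mul__(self, other):
        # Multiplying quantities adds their dimension exponents.
        dims = tuple(a + b for a, b in zip(self.dims, other.dims))
        return Quantity(self.value * other.value, dims)

    def __truediv__(self, other):
        # Dividing quantities subtracts their dimension exponents.
        dims = tuple(a - b for a, b in zip(self.dims, other.dims))
        return Quantity(self.value / other.value, dims)


def metres(x):
    return Quantity(Fraction(x), (1, 0, 0))


def seconds(x):
    return Quantity(Fraction(x), (0, 0, 1))


# 100 m covered in 8 s gives a quantity whose dimensions are length / time.
speed = metres(100) / seconds(8)
```

Here `metres(1) + seconds(1)` raises a `TypeError`, which is the kind of mistake a schlub compiler would presumably catch before the program ever runs.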

Schlub users do care about detecting simple problems (typos, type conversions, etc.) that can be caught at compile time.

Schlub users do care about interop with legacy data formats, so support for JSON, XML, HTML, and CSV is required. Similarly, they do care about HTML documents, because that is a very real use case.

Schlub’s users do not care about the representation of their data. They don’t care about bitwidths of integers, they don’t care about precision of mantissa, they sure as hell don’t care about textual encodings.

Overview of Schlub

I’m not going to give a BNF grammar or whatever of the language, but just hit some salient points that I want to see.

A lot of this is stuff that in other languages would be addressed in the standard library, or left to the whims of the compiler/VM engineers. However, I think that if we call them out ahead of time it’ll make life a lot simpler.

Look-and-feel

So, first thing is that the language should look and read mostly like Pascal/Elixir. This is done for two reasons: first, we want a language that isn’t full of curly-braces and things that would frighten off users, and second (and more importantly) we want a language whose syntax encourages the use of single-pass compilation.

We want single-pass compilation mostly to steer us towards “fast” compilation. I’d ideally like such a language to be continually compiled during entry in an IDE, so that users get immediate feedback when they enter something that doesn’t make sense.

It also helps discourage the later growth of the language compiler to support weird optimizations and other stuff.

We’re going to use := for assignment and = for equality testing. Enough is enough.

Metaprogramming

No macros. If the user is smart/motivated enough to use a macro system, they’re motivated enough to use a real programming language.

This does mean that the language is annoying, and maybe error-prone, for some types of bulk processing.

However, for most users the time taken to pick the perfect macro to clarify a problem is usually going to outweigh the time spent on the barbaric copy-and-paste-and-edit solution that 99% of other users will reach for. Leaving macros out also makes the language simpler to implement and is a minor point in favor of the people who have to read (God help them!) legacy schlub code.

Similarly, no user-defined libraries. Libraries encourage the use of (always eventually bad) abstractions and hamper optimization. If it’s something that is common across many projects, it should be part of the language. If it’s just weird business logic, it’s more likely that a) the business logic shouldn’t exist, b) the logic should be contained next to the code that uses it so both can be replaced together, or c) the logic is fiddly enough that replacing it in a library will break other functioning programs.

We include no features for eval or compile or similar run-time source code translation. Sufficiently warped users can do clever things with shelling out.

Datatypes

There is a small set of datatypes in the language. Primitives are thus:

  • Functions (because composing functions has been shown to be absurdly useful and can be understood easily by end users)
  • UTF-8 strings (because strings, and because UTF-8 can subsume legacy ASCII records easily)
  • Dates
  • Timestamps
  • Time intervals
  • N-dimensional matrices of arbitrary precision complex numbers with units (no specific integer or floating-point type, scalar types are just 1x1)

In addition, composite datatypes (arrangements of primitive data or other composite data according to certain rules) afforded are:

  • Ordered lists
  • Unordered sets
  • Associative arrays

Note that the reason we support composite types is that most interesting real-world engineering consists of taking one arrangement of data, gathering it into a form for calculation, and scattering it back out to another form more convenient for the next business process/consumer.

Any language not giving affordances for that use case is doomed. Any language that offers interop (as Schlub does) with nested hairy data trees (XML, JSON) is doomed.
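The gather, calculate, scatter shape described above might look like the following Python sketch. The input document, its field names, and the three helper functions are invented for illustration. The point is only the flow: a nested JSON tree is flattened into lists, a calculation runs over them, and the result leaves as CSV.

```python
import csv
import io
import json

# Hypothetical input: a nested JSON document, as some upstream system might emit it.
raw = (
    '{"site": "A", "readings": ['
    '{"sensor": "t1", "values": [20.0, 21.0, 22.0]},'
    '{"sensor": "t2", "values": [30.0, 32.0]}]}'
)


def gather(document):
    """Flatten the nested tree into (sensor, values) pairs ready for calculation."""
    return [(reading["sensor"], reading["values"]) for reading in document["readings"]]


def compute(pairs):
    """Calculate one mean per sensor."""
    return [(sensor, sum(values) / len(values)) for sensor, values in pairs]


def scatter(site, rows):
    """Write the results out as CSV text for the next consumer."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["site", "sensor", "mean"])
    for sensor, mean in rows:
        writer.writerow([site, sensor, mean])
    return out.getvalue()


document = json.loads(raw)
report = scatter(document["site"], compute(gather(document)))
```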

A note about typing

Since we support maps and arrays and nesting and first-class functions but also do not support user-defined types, we can’t really be statically typed in any useful sense.

We can however be strongly typed and complain loudly if users try to assign a number to a string, or a map to an array, or whatever else.

A note about storage

All data in schlub is immutable and has referential transparency. This solves a great many issues for automatic generation of concurrent code, and also simplifies the mental model most schlub users have. Schlub does not support rebinding of variable names within the same function scope.
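As a rough illustration of those two rules, here is a Python sketch of a scope whose names can be bound once and whose values are frozen on the way in. `Scope` and `freeze` are hypothetical names. A real schlub compiler would presumably enforce this at compile time rather than at run time as done here.

```python
from types import MappingProxyType


def freeze(value):
    """Recursively convert mutable containers into immutable equivalents."""
    if isinstance(value, dict):
        return MappingProxyType({key: freeze(item) for key, item in value.items()})
    if isinstance(value, (list, tuple)):
        return tuple(freeze(item) for item in value)
    if isinstance(value, (set, frozenset)):
        return frozenset(freeze(item) for item in value)
    return value


class Scope:
    """Hypothetical function scope: a name can be bound once and never rebound."""

    def __init__(self):
        self._names = {}

    def bind(self, name, value):
        if name in self._names:
            raise NameError(f"'{name}' is already bound in this scope")
        self._names[name] = freeze(value)
        return self._names[name]

    def __getitem__(self, name):
        return self._names[name]


scope = Scope()
scope.bind("order", {"id": 7, "items": ["bolt", "nut"]})
```

After this, both `scope.bind("order", ...)` and `scope["order"]["id"] = 8` fail: the first with a `NameError`, the second with a `TypeError` from the read-only mapping.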

Math

Math operations in Schlub are geared more towards something a user of Fortran, Matlab, or R would be familiar with.

All numbers are n-dimensional matrices of complex numbers internally. For the convenience of users, scalar values can be written without using matrix notation (the compiler can take care of that for them).
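A small Python sketch of the "scalars are just 1x1 matrices" idea follows. It only handles two-dimensional matrices stored as nested lists, leaves out units and arbitrary precision, and the names `as_matrix` and `matmul` are made up for illustration. It is meant to show the lifting step, not a real numeric core.

```python
def as_matrix(x):
    """Lift a bare scalar to a 1x1 matrix; convert nested lists to complex entries."""
    if isinstance(x, (int, float, complex)):
        return [[complex(x)]]
    return [[complex(entry) for entry in row] for row in x]


def is_scalar(matrix):
    return len(matrix) == 1 and len(matrix[0]) == 1


def matmul(a, b):
    """Multiply two values, treating a 1x1 operand as a scalar that scales the other."""
    a, b = as_matrix(a), as_matrix(b)
    if is_scalar(a):
        return [[a[0][0] * entry for entry in row] for row in b]
    if is_scalar(b):
        return [[entry * b[0][0] for entry in row] for row in a]
    if len(a[0]) != len(b):
        raise ValueError("inner dimensions do not match")
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


# The user writes plain scalars; internally everything is a complex matrix.
minus_one = matmul(1j, 1j)
swapped = matmul([[0, 1], [1, 0]], [[1j], [2]])
```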

It’s annoying having a language that has no support for complex numbers (looking at you, JS), or that has weird and shoddy support for them (hi, C). Even if the vast majority of cases do not require it, when doing some types of math (or signal processing, or other stuff) it is quite necessary.

A note on representation of numbers

The reason for hiding the machine representation of numbers from users is that exposing it is almost always done either to enable clever bit-twiddling hacks (outside the scope of the language) or to constrain what users can accomplish by making them aware of the limitations of the machine they’re running on.

Clearly, though, we can’t have users losing their minds (and cycles!) over storing numbers with great precision where it really doesn’t matter. Similarly, there is no hope of doing proper optimization and making use of hardware correctly if there is no way of saying “okay, this fits in 32 bits, go do the needful”.

The approach that schlub takes to solving this problem is to instead require the user to specify the precision of mathematical operations they desire (in terms of either significant figures or decimal precision). This allows a sufficiently advanced compiler (haaaah) to select the most efficient representation and operations for implementing a given operation, as well as offering up the opportunity to give schlub users basic information about the precision of their calculations.
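Python's standard `decimal` module gives a rough feel for block-scoped precision. In the sketch below the helper name `with_precision` is made up, and only the "significant figures" flavor is shown. Inside the block, every arithmetic result is rounded to the requested number of significant digits, and the setting disappears when the block ends.

```python
from decimal import Decimal, localcontext


def with_precision(significant_figures, operation):
    """Run `operation` with arithmetic rounded to the requested significant figures."""
    with localcontext() as ctx:
        ctx.prec = significant_figures
        return operation()


# The same calculation, asked for at two different precisions.
third_rough = with_precision(5, lambda: Decimal(1) / Decimal(3))
third_fine = with_precision(30, lambda: Decimal(1) / Decimal(3))
```

Here `third_rough` is `Decimal('0.33333')` and `third_fine` carries thirty threes. A schlub compiler could, in principle, use the same declaration to pick a 32-bit float for the first block and something heavier for the second.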

This would also perhaps help curb the troubling tendency of people to go and trust blindly long strings of digits that give a false sense of quality of calculations—but that’s perhaps just optimism on my part!

Functional programming in schlub

Things like for-loops are holdovers from having to manually increment pointers, and since we don’t expose pointers we don’t expose for-loops.

Additionally, for the layperson instructions like “eat everything on the table” (in JavaScript, something like table.everything().map(eat)) or “slice every orange on the table” (table.everything().filter(isOrange).map(slice)) make quite a bit of sense if explained in terms of sets of things and operations on those sets.

So, schlub supports the usual things like map, reduce, filter, and functions-as-arguments. We also support the extremely useful idea of sets and set operations, such as unions and intersections (commonly used) and Cartesian products.
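For reference, this is roughly what those constructs look like in Python. The table contents, the pantry, and the helper functions are invented for illustration; `map`, `filter`, `functools.reduce`, `itertools.product`, and the set operators are standard Python.

```python
from functools import reduce
from itertools import product

table = [
    {"name": "orange", "kind": "fruit"},
    {"name": "apple", "kind": "fruit"},
    {"name": "bread", "kind": "grain"},
]


def is_orange(item):
    return item["name"] == "orange"


def slice_item(item):
    return "sliced " + item["name"]


# "Slice every orange on the table": filter, then map.
sliced = list(map(slice_item, filter(is_orange, table)))

# A reduce folds the whole collection down to a single value.
total_letters = reduce(lambda acc, item: acc + len(item["name"]), table, 0)

# Set operations: union, intersection, and Cartesian product.
on_table = {item["name"] for item in table}
in_pantry = {"apple", "rice"}
everything = on_table | in_pantry
in_both = on_table & in_pantry
pairings = set(product(in_both, {"plate", "bowl"}))
```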

The other nice thing is that, by using functional constructs, we allow the user to instrument their programs in such a way as to allow easy scaling for certain types of tasks. If a thing runs very slowly, forking things off into different threads (or machines, in a hosted environment!) as part of the internals of a map or reduce call seems quite attractive.
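A sketch of that idea in Python: a drop-in `parallel_map` (a made-up name) that hands each call to a thread pool but returns results in the same order as an ordinary map. This is only safe because the mapped function is assumed to be pure and the data immutable, which is exactly what schlub would guarantee.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_map(function, items, workers=4):
    """Return the same results, in the same order, as map(); calls may run concurrently."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(function, items))


def square(x):
    return x * x


results = parallel_map(square, range(10))
```

The caller's code does not change between the sequential and the concurrent version. Swapping the thread pool for a pool of machines is an implementation detail hidden behind the same call.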

Weak points and existential problems with schlub

As fun as this is, there are a few issues that I see with the language.

First, if the standard library and language are not continuously updated, they lose their usefulness. Or, worse still, users create transpilers to convert their favorite dialect of Schlub into canonical schlub (adding things like, say, libraries). This kills the portability of Schlub.

Second, the target audience for this language might not actually exist. The folks that need to do those sorts of operations probably already have basic proficiency in NumPy (or Perl, or Tcl, or R) and lack the desire to switch.

Third, the approach to software writing might be too pessimistic. Maybe requiring copy-and-pasting of code for each iteration of a project is actually a bad idea. Maybe maintenance and software-in-the-large is actually a valid concern.

Fourth, schlub is meant to write basic transform programs, or to do batch processing. Writing a GUI in it would be miserable, writing a compiler excruciating. There are problem domains like this that schlub is awful at (by design), and I don’t know if those domains are actually as important as I think they are.

Fifth, the wink-and-nod of schlub is that, behind the scenes, our focus on business things and not on CS/SE-relevant things (say, by hiding the machine representation of numbers or by requiring a particular form of FP) is rewarded by letting us do clever optimizations and automagical distribution of tasks and things. It is extremely naive to assume that that bet actually pays off—“sufficiently-advanced compilers” are almost always a pipe dream.

In spite of these issues, I still think the concept is an interesting one, and I’d appreciate email feedback from folks who’d like to spitball with me on it.