Saturday, September 27, 2014

GHCID - a new GHCi based IDE (ish)

Summary: I've just released ghcid, which interactively shows the errors in your project on each save.

I'm pleased to announce ghcid, which is either "GHCi as a daemon" or "GHC + a bit of an IDE". I've been using it for my development for the last few days, and already find it quite valuable. Unlike other Haskell development tools, ghcid is intended to be incredibly simple. In particular, it doesn't integrate with any editors, doesn't depend on GHC the library and doesn't start web servers. It's under 200 lines of fairly dull Haskell, which talks over pipes to your existing ghci.

Using it

Run cabal update && cabal install ghcid to install it as normal. Then run ghcid --height=10 "--command=ghci Main.hs". The height is the number of lines you are going to resize your console window to (defaults to 8), and the command is how you start this project in ghci. Personally, I always create a .ghci file at the root of all my projects, which usually reads something like:

:set -fwarn-unused-binds -fwarn-unused-imports
:set -isrc
:load Main

With that you can pass --command=ghci (or nothing, since that is the default).

After that, resize your console and make it so you can see it while working in your editor. On Windows the ghcid console will automatically sit on top of all other windows. On Linux, you probably want to use your window manager to make it topmost or use a tiling window manager.

What you get

On every save you'll see a list of the errors and warnings in your project. It uses a single ghci under the hood, so even relatively large projects should update their status pretty quickly. As an example:

Main.hs:23:10:
    Not in scope: `verbosit'
    Perhaps you meant `verbosity' (imported from System.Console.CmdArgs)
Util.hs:18:1: Warning: Defined but not used: `foo'

Or, if everything is good, you see:

All good

This project is only a few days old, so please report any bugs you find.

What you want

I regularly use an IDE to develop in a Haskell-like language. I find that with the IDE I'm about 25% more productive than without it. While an IDE can provide lots of neat features (go to definition, search, type tooltips) I think most of the productivity gains come from:

  1. Syntax coloring.
  2. A list of errors and warnings...
  3. ...which is updated as you type...
  4. ...and are highlighted in the text (red squiggles).

Every text editor already provides syntax coloring. With ghcid you get the list of errors and warnings. To get the final two features you need to integrate with an editor. I'm hoping that ghcid can do some of the heavy lifting by taking a directory of files to treat as overrides, and producing a list of warnings/errors to a file. The rest is editor specific, and I hope to attempt integration with Sublime Text at some point (although would love some help).

Wednesday, September 24, 2014

Generating Open 3D Viewer Models

Summary: It's not obvious how to generate suitable Open 3D Viewer models, but with the right tools it isn't hard.

The Open 3D Viewer project is very cool - as an example, here is a demo of a spinning cow rendered in the browser. For my wife's work I wanted to generate a 3D model of a fossil bedding plane. Effectively, given some x/y/z coordinates, produce a pretty rendering. It took a fair bit of guesswork, so I wrote down the steps.

Step 1: Generate an OBJ file

Generate an OBJ file. You probably want an MTL file too, but it seems the 3D viewer only uses the Kd field. There are a few ways to get an OBJ file:

  • There are many samples on the web, including a snail in the Open 3D Viewer repo.
  • You can create OBJ files in a tool such as Blender, but the Blender interface confused me a lot (I am definitely not their intended audience).
  • You can generate an OBJ file using a Haskell script. I picked this method, and I'll write a blog about the script later, once I have some pretty pictures to show.
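
The script I used is more involved, but to give a flavour of the OBJ format, here is a minimal sketch. The function name writeObj, the output file name and the points are all made up, and joining consecutive triples of points into triangles is just one arbitrary choice:

-- write one "v x y z" line per vertex, then one "f i j k" face per
-- consecutive triple of vertices (OBJ indices are 1-based)
writeObj :: FilePath -> [(Double, Double, Double)] -> IO ()
writeObj file pts = writeFile file $ unlines $
    [unwords ["v", show x, show y, show z] | (x,y,z) <- pts] ++
    [unwords ["f", show i, show (i+1), show (i+2)] | i <- [1,4 .. length pts - 2]]

main :: IO ()
main = writeObj "mymodel.obj" [(0,0,0), (1,0,0), (0,1,0)]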

Step 2: Get the tools

There are some tools in the WebGL Loader project. Alas, that project says "for now I recommend r50 as the last stable revision", so there are two versions of the tools to try: the latest and r50. I tried both. I had some limited success with r50 (it didn't seem to render properly, but it did run), while the latest revision segfaulted. Fortunately I found the tools in a Google Groups post, and have mirrored them in my repo (with trivial tweaks to support Python 2.7).

Step 3: Run objcompress

You need to run:

objcompress mymodel.obj mymodel.utf8 > mymodel.js

This will generate lots of mymodel*.utf8 files and mymodel.js.

Step 4: Run part_grouping.py

You need to run:

py part_grouping.py

(The file in the email is part&grouping.py, but I renamed my copy.) This script will interactively ask a really long list of questions. I put the correct responses into a file and pipe it in:

py part_grouping.py < response.txt

This generates the files groupings.txt and part_info.txt.

Step 5: Run make_viewer_metadata.py

Run:

py make_viewer_metadata.py

This generates the file entity_metadata.json.

Step 6: Get the viewer source

You can get the viewer source from the Open 3D Viewer repo. I have mirrored it in my repo, but I may tweak the viewer over time to match my wife's needs - you should get the original.

Step 7: Copy your files

Copy all the files from steps 1 to 4 to a directory inside the viewer named models/mymodel.

Step 8: Update the model list

Open up scripts/models.js and edit it to point at your model. For example:

o3v.MODELS = [{
  name:'mymodel.obj',
  scriptName:'mymodel.js',
  modelPath:'models/mymodel/',
  metadataFile:'entity_metadata.json',
  numLayers:12
}];

Step 9: View the result

You can view the result by opening index.html. In Chrome you may need to pass the flag --allow-file-access-from-files.

Tuesday, September 16, 2014

Towards Shake 1.0

Summary: I've just released a new version of Shake, with a --demo feature and an underlying continuation monad. I want to release v1.0 in the near future.

I've just released a new version of the Shake build system, version 0.13.3. While the version number is only 0.0.1 higher, the changelog lists a large number of improvements. In particular, two changes are:

  • The Action monad is now based on continuations, which allows Shake to suspend threads without requiring a GHC RTS thread. The result is significantly less memory used on thread stacks. I still find it quite amazing that Haskell has such powerful and robust abstraction mechanisms that a huge change doesn't even break the API.
  • The shake binary now features a --demo mode, invoked by running shake --demo. This mode generates a Shake project, compiles it, and shows off some of the features of Shake. You can see the output of --demo here.

Version 1.0

With the two features above, I'm now looking towards Shake version 1.0. I'm not looking to make any significant backwards-incompatible change, or necessarily any code/API changes at all. However, if you have anything you think should be addressed before reaching such a milestone, please comment on the issue tracker or email the mailing list.

Shake website

The one thing I still want to finish before releasing version 1.0 is to have a proper website for Shake. I've registered shakebuild.com which will host the content, and have set up GitHub pages to serve it up. I have some rough content in the docs directory and a prototype generator in the website directory - as an example it currently generates something a bit like this for the user manual, but with a table of contents when run through the latest generator. I'd appreciate any help with the content, the generator, or the styling - just email the mailing list.

Sunday, September 07, 2014

Shake in the wild

Summary: I spotted a few things using Shake, which I had nothing to do with.

In the past few days I have come across several things using the Shake build system. I wasn't involved in any of them, and haven't (yet) tried any of them out, but they certainly look cool.

ToolCabal

Tibor Bremer from Utrecht University gave a talk at the Haskell Implementors Workshop 2014 about his ToolCabal project. This project replaces the "build a package" part of Cabal with something more flexible, supporting multiple simultaneous targets and more flexible preprocessors - all built on top of Shake. It doesn't attempt to tackle dependency resolution yet. There is a video of the talk.


Samplecount

The folks at Samplecount have written several Shake-based things. None are yet on Hackage, so I suspect they are still somewhat prototypes, but they look like they're already used quite seriously.

  • shake-cabal-build to make it easier to build your Shake build systems with Cabal. Shake build systems need to be compiled with GHC, for which I usually use ghc --make, but this project explains how to get things building with Cabal - important if your build system pulls in other libraries.
  • shake-language-c is a project to simplify building C/C++ projects with Shake. From the docs:

shake-language-c is a cross-platform build system based on the Shake Haskell library. The focus is on cross-compilation of C, C++ and Objective C source code to various target platforms. Currently supported target platforms are iOS, Android NDK, Google Portable Native Client, MacOS X, Linux and Windows (MinGW). Supported host platforms are MacOS X, Linux and Windows.

  • methcla is their mobile sound engine, which is built using this Shake script, which (unsurprisingly) uses shake-language-c and shake-cabal-build.

Wednesday, August 20, 2014

Continuations and Exceptions

Summary: In moving Shake to continuations, exceptions were the biggest headache. I figured out how to somewhat integrate continuations and exception handling.

The git repo for Shake now suspends inactive computations by capturing their continuation instead of blocking their thread, based on the continuations I described in a previous blog post. The most difficult part was managing exceptions. I needed to define a monad where I could capture continuations and work with exceptions, requiring the definitions:

data M a = ... deriving (Functor, Applicative, Monad, MonadIO)
throwM :: SomeException -> M a
catchM :: M a -> (SomeException -> M a) -> M a
captureM :: ((a -> IO ()) -> IO ()) -> M a

I'm using M as the name of the monad. I want equivalents of throwIO and catch for M, along with a function to capture continuations.
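
To give an intuition for captureM, here is a hypothetical usage sketch (the names waitForLine and resume are mine; it assumes the definitions in this post are in scope). It turns a callback-style API into a direct-style M computation, where the callback it is handed is "the rest of the program":

-- block until a line of input is available
waitForLine :: M String
waitForLine = captureM $ \resume ->
    -- a real program might hand 'resume' to an event loop or another thread;
    -- here we simply read a line and call the continuation straight away
    resume =<< getLine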

The first observation is that since catchM must catch any exceptions, including those thrown by users calling error, throwM can be defined as:

throwM = liftIO . throwIO

Using throwIO gives better guarantees about when the exception is raised, compared to just throw.

The second observation is that sometimes I want to raise an exception on the continuation, rather than passing back a value. I can build that on top of captureM with:

captureM' :: ((Either SomeException a -> IO ()) -> IO ()) -> M a
captureM' k = either throwM return =<< captureM k

The third observation (which I only made after a few weeks of trying not to follow it) is that the continuation may never be called, and that means you cannot implement a robust finallyM function. In particular, if the person who was intending to run the continuation themselves raises an exception, the continuation is likely to be lost. I originally tried to come up with schemes for defining the function that is passed the continuation so as to guarantee the continuation was called, but it became messy very quickly.

The properties we expect of the implementation, to a rough approximation, include:

  • catchM (x >> throwM e) (\_ -> y) >> z === x >> y >> z -- if you throw an exception inside a catchM, you must run the handler.
  • captureM (\k -> x) >>= y === x -- if you execute something not using the continuation inside captureM it must behave like it does outside captureM. In particular, if the captureM is inside a catchM, that catchM must not catch the exception.
  • captureM (\k -> k x) >>= y === x >>= y -- if you capture the continuation then continue that must be equivalent to not capturing the continuation.
  • captureM (\k -> k x >> k x) >>= y === (x >>= y) >> (x >>= y) -- if you run the continuation twice it must do the same IO actions each time. In particular, if the first gets its exceptions caught, the second must do also.

These properties are incomplete (there are other things you expect), and fuzzy (for example, the second property isn't type correct) - but hopefully they give an intuition.

The implementation was non-trivial and (sadly) non-elegant. I suspect a better implementation is known in the literature, and I'd welcome a pointer. My implementation defines M as:

type M a = ContT () (ReaderT (IORef (SomeException -> IO ())) IO) a

Here we have a continuation monad wrapping a reader monad. The reader contains an IORef which stores the exception handler. The basic idea is that whenever we start running anything in M we call the Haskell catch function, and the exception handler forwards to the IORef. We can define catchM as:

catchM :: M a -> (SomeException -> M a) -> M a
catchM m hdl = ContT $ \k -> ReaderT $ \s -> do
    old <- liftIO $ readIORef s
    writeIORef s $ \e -> do
        s <- newIORef old
        hdl e `runContT` k `runReaderT` s `E.catch`
            \e -> ($ e) =<< readIORef s
    flip runReaderT s $ m `runContT` \v -> do
        s <- ask
        liftIO $ writeIORef s old
        k v

  • We store the previous exception handler as old, and insert a new one. After the code has finished (we have left the catchM block) we restore the old exception handler. In effect, we have a stack of exception handlers.
  • When running the handler we pop off the current exception handler by restoring old, then since we have already used up our catch, we add a new catch to catch exceptions in the handler.
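
As a small check of the handler-stack behaviour, consider a hypothetical example (the name nested is mine; toException and userError come from base, and the definitions in this post are assumed to be in scope):

-- the innermost enclosing catchM wins, just as with Control.Exception.catch
nested :: M String
nested =
    catchM
        (catchM (throwM $ toException $ userError "boom")
                (\_ -> return "caught by the inner handler"))
        (\_ -> return "caught by the outer handler")
-- evaluates to "caught by the inner handler"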

We then define captureM as:

captureM :: ((a -> IO ()) -> IO ()) -> M a
captureM f = ContT $ \k -> ReaderT $ \s -> do
    old <- readIORef s
    writeIORef s throwIO
    f $ \x -> do
        s <- newIORef old
        flip runReaderT s (k x) `E.catch`
            \e -> ($ e) =<< readIORef s
        writeIORef s throwIO

  • We make sure to switch the IORef back to throwIO before we start running the user's code, and again after we have finished running our code and are about to return to the user's code. As a result, if the function that captures the continuation throws an exception, it will be raised as normal.
  • When running the continuation we create a new IORef for the handler, since the continuation might be called twice in parallel, and the separate IORef ensures they don't conflict with each other.

Finally, we need a way to run the computation. I've called that runM:

runM :: M a -> (Either SomeException a -> IO ()) -> IO ()
runM m k = do
    let mm = do
            captureM $ \k -> k ()
            catchM (Right <$> m) (return . Left)
    s <- newIORef throwIO
    mm `runContT` (liftIO . k) `runReaderT` s

The signature of runM ends up being the only signature that makes sense given the underlying mechanisms. We define mm by using the facilities of captureM to insert a catch, and catchM to ensure we never end up in an exception state from runM. The rest is just matching up the types.
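
As a hypothetical end-to-end sketch (demo and act are my names; it assumes all the definitions above live together in one module), an exception thrown inside M arrives at the final continuation as a Left:

demo :: IO ()
demo = runM act $ either
    (\e -> putStrLn $ "caught: " ++ show e)   -- Left: the computation threw
    (\() -> putStrLn "finished")              -- Right: the computation succeeded
  where
    act :: M ()
    act = do
        liftIO $ putStrLn "running"
        throwM $ toException $ userError "boom"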

Stack depth could potentially become a problem with this solution. If you regularly do:

captureM (\k -> k ())

Then each time a catch will be wrapped around the function. You can avoid that by changing captureM to throw an exception:

captureM :: ((a -> IO ()) -> IO ()) -> M a
captureM f = ContT $ \k -> ReaderT $ \s -> do
    old <- readIORef s
    writeIORef s $ \_ ->
        f $ \x -> do
            s <- newIORef old
            flip runReaderT s (k x) `E.catch`
                \e -> ($ e) =<< readIORef s
    throwIO anyException

Here we unwind the catch by doing a throwIO, after installing our exception handler which actually passes the continuation. It is a bit ugly, and I haven't checked whether the catch is actually a problem, or whether this solution solves it.

The implementation in Shake is a bit different to that described above. In Shake I know that captured continuations are never called more than once, so I can avoid creating a new IORef in captureM, and I can reuse the existing one. Since I never change the handler, I can use a slightly less powerful definition of M:

type M a = ReaderT (IORef (SomeException -> IO ())) (ContT () IO) a

The resulting code is Development.Shake.Monad, which implements the RAW monad, and also does a few extra things which are irrelevant to this post.

The cool thing about Haskell is that I've been able to completely replace the underlying Shake Action monad from StateT/IO, to ReaderT/IO, to ReaderT/ContT/IO, without ever breaking any users of Shake. Haskell allows me to produce effective and flexible abstractions.

Tuesday, August 12, 2014

Safe library - generalising functions

Summary: The Safe library now has exact versions of take/drop, with twelve functions implemented on top of a generalised splitAt.

The Safe library is a simple Haskell library that provides versions of standard Prelude and Data.List functions that usually throw errors (e.g. tail), but wrapped to provide better error messages (e.g. tailNote), default values (e.g. tailDef) and Maybe results (e.g. tailMay).

I recently released version 0.3.5, which provides a new module Safe.Exact containing crashing versions of functions such as zip/zipWith (which error if the lists are not equal length) and take/drop/splitAt (which error if there are not enough elements), then wraps them to provide safe variants. As an example, the library provides:

takeExact    :: Int -> [a] -> [a]
takeExactMay :: Int -> [a] -> Maybe [a]

These are like take, but if the Int is larger than the length of the list it will throw an error or return Nothing. Some sample evaluations:

takeExactMay 2 [1,2,3] == Just [1,2]
takeExact    2 [1,2,3] == [1,2]
takeExactMay 2 [1] == Nothing
takeExact    2 [1] ==
    1:error "Safe.Exact.takeExact, index too large, index=2, length=1"
take 1 (takeExact 2 [1]) == [1]

So takeExactMay computes up-front whether the whole computation will succeed, and returns a Nothing if it will fail. In contrast, takeExact produces elements while they are present, but if you demand an additional element that is missing it will result in an error. All the exceptions in the Safe library are designed to provide the maximum level of detail about what went wrong, here telling us the index we were after and the length of the list.

The library provides takeExact, dropExact and splitAtExact, plus Def/May/Note versions, resulting in twelve similar functions. While the implementation of any one function is reasonably short (although not that short, once proper error messages are provided), I didn't want to write the same code twelve times. However, generalising over functions that check up-front and those that check on-demand requires a bit of thought. In the end I settled for:

splitAtExact_ :: (String -> r) -> ([a] -> r) -> (a -> r -> r) -> Int -> [a] -> r
splitAtExact_ err nil cons o xs
    | o < 0 = err $ "index must not be negative, index=" ++ show o
    | otherwise = f o xs
    where
        f 0 xs = nil xs
        f i (x:xs) = x `cons` f (i-1) xs
        f i [] = err $
            "index too large, index=" ++ show o ++ ", length=" ++ show (o-i)

Here the splitAtExact_ function has a parameterised return type r, along with three functional arguments that construct and consume the r values. The functional arguments are:

  • err :: String -> r, says how to convert an error into a result value. For up-front checks this produces a Nothing, for on-demand checks this calls error.
  • nil :: [a] -> r, says what to do once we have consumed the full number of elements. For take we discard all the remaining elements, for drop we are only interested in the remaining elements.
  • cons :: a -> r -> r, says how to deal with one element before we reach the index. For take this will be (:), but for functions producing a Maybe we have to check the r parameter first.

With this generalisation, I was able to write all twelve variants. As a few examples:

addNote fun msg = error $ "Safe.Exact." ++ fun ++ ", " ++ msg

takeExact = splitAtExact_ (addNote "takeExact") (const []) (:)

dropExact = splitAtExact_ (addNote "dropExact") id (flip const)

takeExactMay = splitAtExact_ (const Nothing) (const $ Just []) (\a -> fmap (a:))

dropExactMay = splitAtExact_ (const Nothing) Just (flip const)

splitAtExact = splitAtExact_ (addNote "splitAtExact")
    (\x -> ([], x)) (\a b -> first (a:) b)

splitAtExactMay = splitAtExact_ (const Nothing)
    (\x -> Just ([], x)) (\a b -> fmap (first (a:)) b)

Normally I would have defined takeExact and dropExact in terms of fst/snd on top of splitAtExact. However, in the Safe library error messages are of paramount importance, so I go to additional effort to ensure the error says takeExact and not splitAtExact.
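
The Note and Def flavours follow the same pattern. As a hedged sketch (the argument order and error wording may not match the library's actual definitions; fromMaybe comes from Data.Maybe):

-- append the caller's note to the standard error message
takeExactNote :: String -> Int -> [a] -> [a]
takeExactNote note = splitAtExact_
    (\msg -> addNote "takeExactNote" (msg ++ ", " ++ note)) (const []) (:)

-- fall back to a default value if there are not enough elements
takeExactDef :: [a] -> Int -> [a] -> [a]
takeExactDef def n xs = fromMaybe def (takeExactMay n xs)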

Saturday, July 26, 2014

Converting Make to Shake

Summary: I have converted over 10,000 lines from Make to Shake. Here are some tips I learnt along the way.

Make is the de facto build system for large projects - if no one made an active choice, your project is probably using Make. The Shake build system can be a better alternative, but how should you convert? The following tips are based on my experience converting a 10,000 line Make system to Shake.

Shake can do whatever Make can

Shake is more powerful than Make, so if Make could do something, Shake probably can too. As a first approximation, the Make snippet:

output: input1 input2
    shell command to run

Becomes:

"output" *> \out -> do
    need ["input1","input2"]
    cmd Shell "shell command to run"

In addition:

  • Variables in Make usually map to normal Haskell variables.
  • Definitions of rules and dependencies use the functions from Development.Shake. For example, .PHONY maps to the phony function.
  • Filepath manipulation uses the functions from Development.Shake.FilePath.
  • Dynamically generated include files can be handled with needMakefileDependencies from Development.Shake.Util.
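
Putting those pieces together, a complete (if minimal) Shakefile might look like the sketch below, reusing the made-up names from the snippet above:

import Development.Shake

main :: IO ()
main = shakeArgs shakeOptions $ do
    -- the default target, like the first rule in a Makefile
    want ["output"]

    -- equivalent to a .PHONY clean rule
    phony "clean" $
        removeFilesAfter "." ["output"]

    "output" *> \out -> do
        need ["input1","input2"]
        cmd Shell "shell command to run"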

Preserve the file/directory structure

The existing Make system will generate object files with particular names in particular places. Often these locations aren't what you would pick if you wrote the build system afresh. However, resist the temptation to "clean up" these pieces during the conversion. Treat the file locations as a specification, which lets you focus on the conversion to Shake without simultaneously redesigning a large and complex build system.

Treat the Makefile as a black box

Often the existing Makefile will be hard to understand, and sometimes won't be worth reading at all. The most important information in the Makefile is what commands it runs, which can be determined by make clean && make -j1 > log.txt, capturing a complete list of the commands run. From the commands it is usually relatively easy to determine the inputs and outputs, from which you can write the Shake rules. However, the Makefile can be useful to figure out which commands to group into a single rule, and how to generalise rules to cover multiple files.

Split the metadata from the logic

Often the Makefiles combine metadata (these object files go into this executable) with logic (use gcc -O2 to build all executables). Shake is great for writing build logic, but metadata is often better placed in separate files (the Haskell syntax can be a little heavy). You can use the full power of Haskell to store whatever metadata you require, and addOracle from Shake can introduce granular dependencies on the information. The module Development.Shake.Config provides some helper functions that might serve as a suitable base.

To bootstrap the Shake system, often the metadata can be extracted from the existing Makefiles. You can write a temporary script to parse the Makefile and extract whatever you consider the metadata, clean it up, and write it to new configuration files. Initially the config files are generated, but once you delete the Make original, they become source files.
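
As a hedged sketch of what that can look like (the file name build.cfg, the key cflags and the C rule are all made up), using usingConfigFile and getConfig from Development.Shake.Config:

import Development.Shake
import Development.Shake.Config
import Development.Shake.FilePath

main :: IO ()
main = shakeArgs shakeOptions $ do
    -- track the metadata file; rules depend on the keys they read via getConfig
    usingConfigFile "build.cfg"   -- e.g. contains the line: cflags = -O2

    "*.o" *> \out -> do
        cflags <- getConfig "cflags"
        let src = out -<.> "c"
        need [src]
        cmd "gcc -c" (maybe [] words cflags) [src] "-o" [out]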

Focus on a single platform/configuration

Often a build system will be cross-platform (Linux/Mac/Windows), build multiple targets (binaries/distribution package/documentation) and build multiple configurations (release/debug/profile). To start the conversion, focus only on the most heavily developed platform/configuration - if the migration is successful, abstracting over the differences is far easier in Shake than Make. You may wish to start with a simple target to try out Shake (e.g. documentation), but after that work on the target developers use every day, so that the developers can make use of the improvements sooner, motivating the migration.

Convert bottom up

Shake demands that it has built all the dependencies (it checks the modification time is equal to what it remembered), whereas Make only requires that targets are newer than their dependencies. As a result, you should start converting the leaves of the build system to Shake, and work upwards. Provided you use the same file/directory structure, you can then build what you have defined with Shake, then finish the build with Make, checking the result still works as expected.

Run Make and Shake in parallel

Once you have migrated enough of the build system to be useful (the usual targets in the most common configuration), you should encourage some developers to try Shake instead of Make. These developers will find things that don't work properly, hidden features in the Make system that no one knew about, etc. Expect to fix problems and iterate several times.

Hopefully the Shake system will be faster and more robust. Once these advantages have encouraged all the main developers to make the switch, you should delete/disable the Make system and expect it to bitrot quickly.

Refactor individual rules

As you are converting rules from Make to Shake you can translate them directly and refactor later, or convert straight into more idiomatic Shake. As an example, you might start with:

cmd Shell "ls >" out

The argument Shell tells Shake to use the system shell, meaning that > redirect works. Later on you may wish to switch to:

Stdout result <- cmd "ls"
writeFile' out result

Now you are invoking the ls command directly, capturing the output using Shake. Sometime later you may switch to:

getDirectoryFiles "." ["*"]

This is the Shake-tracked way of getting a list of files. Similarly, calling sed or for through Shell should probably be gradually converted to Shake/Haskell operations.
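
For example, a rule that previously piped through sed to delete blank lines could become plain Haskell inside the Action - a sketch with made-up file names:

"cleaned.txt" *> \out -> do
    -- readFile' also records a dependency on the input file
    contents <- readFile' "raw.txt"
    writeFile' out $ unlines $ filter (not . null) $ lines contents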

Refactor the whole

Once you have converted the whole build system, and disabled the original Make system, you may wish to refactor the build system - putting files in more appropriate places, rethinking file dependencies etc. In truth, I've never got round to this step, and I would be surprised if many people did. However, as the build system grows, hopefully the new bits with sensible decisions will gradually begin to outnumber the old bits with questionable design.

Ask if you get stuck

Build systems (even in Shake) are complex entities, with intricate coordination between files, which mostly run untyped external commands with many platform/version differences. As a result, build systems are often complex to write.

If you have a problem using Shake, just ask. If you can boil down the problem to something fairly standalone, ask on StackOverflow with the tag shake-build-system. If you are looking for more general advice, ask on the mailing list. If you succeed, write a blog post and tweet me.