
Sunday, June 2, 2013

Control.Category: now with kind polymorphism

Just a quick update.

The other day, I pushed a patch to the Haskell base library that will enable kind polymorphism for the Category type class. That change will come to the public when GHC 7.8.1 is released.

This is a simple change that makes Category far more general, while not breaking any code already out there. (The change was originally proposed by none other than Edward Kmett, although he’s not the only one wanting it.)

Out with the old, in with the new

The typical definition of Category looks like:

class Category cat where
  id  :: cat a a
  (.) :: cat b c -> cat a b -> cat a c

Which is all great. But there's a problem: by default, the type variable cat is given a kind like (cat :: * -> * -> *), meaning a Category can only be created from types whose parameters a and b are of kind *.

This isn’t very rich or expressive. In fact, there are many legal Category instances which cannot be created with the above definition, as the parameters (a and b) are of a more exotic kind - for example, not of kind * but of kind (*,*) - the kind of promoted tuples.

Luckily for us, GHC has kind polymorphism. So we just needed to enable it. This is a completely backwards compatible change as well.

With the little addition of {-# LANGUAGE PolyKinds #-}, we can now host much richer instances of Category: under -XPolyKinds, kind generalization occurs automatically. So, at pretty much no cost, the change turned:

class Category (cat :: * -> * -> *) where

into:

class Category (cat :: k -> k -> *) where

where the k stands for a kind variable, and can be instantiated to various new kinds, not just *.

In GHC 7.8, if you ask GHCi about Category, you will get:

Prelude> import Control.Category
Prelude Control.Category> :i Category
class Category (k :: BOX) (cat :: k -> k -> *) where
  Control.Category.id :: cat a a
  (Control.Category..) :: cat b c -> cat a b -> cat a c
       -- Defined in ‛Control.Category’
instance Category * (->) -- Defined in ‛Control.Category’
Prelude Control.Category> :k Category
Category :: (k -> k -> *) -> Constraint
Prelude Control.Category>

(For the curious: the (k :: BOX) annotation cannot be written explicitly in source code; it is how GHC displays the kind variable that PolyKinds introduces implicitly. BOX is the sort of kinds, in the same way that * is the kind of types.)

Logical entailment

There’s a simple instance of Category which we already know and love: the instance for function types. Let the type variable cat become (->) and we get:

id :: cat a a
  = { given the equality, cat = (->) }
id :: (->) a a
  = { by syntactic equality }
id :: a -> a

(.) :: cat b c -> cat a b -> cat a c
  = { given the equality, cat = (->) }
(.) :: ((->) b c) -> ((->) a b) -> ((->) a c)
  = { by syntactic equality }
(.) :: (b -> c) -> (a -> b) -> (a -> c)

And these are the regular types of Prelude.id and (Prelude..), so:

instance Category (->) where
  id  = Prelude.id
  (.) = (Prelude..)

Another example Edward brought up, though, is logical entailment (or logical implication). This is an entertaining example because it's easy, and it looks very similar to the types above. With modern GHC features, we can formulate it using the new Constraint kind.

Let Dict be the ultimate reified dictionary, using a GADT:

data Dict :: Constraint -> * where
  Dict :: ctx => Dict ctx

Given a term of type Dict x, we can witness the constraint x. Max Bolingbroke wrote some stuff about this trick when he implemented ConstraintKinds.
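To make the 'witnessing' concrete, here's a minimal sketch (withDict and showWith are names I've made up for illustration):

withDict :: Dict c -> (c => r) -> r
withDict Dict r = r  -- matching on Dict brings the constraint c into scope

showWith :: Dict (Show a) -> a -> String
showWith d x = withDict d (show x)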

Now, let the type a |- b read ‘a entails b’, implemented as:

newtype (|-) a b = Sub (a => Dict b)

I’ll get to the explanation in a moment. If we peer through the looking glass we can see the term Sub Dict as a kind of evidence that some statement a |- b holds. So now there are some obvious statements we can make about entailment. For one, there is the simple property of reflexivity: “if a, then a”, or in other words a |- a:

reflexive :: a |- a
reflexive = Sub Dict
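Reflexivity isn't the only entailment we can witness this way; for instance, a conjunction of constraints entails either component (weaken1 here is an illustrative name of mine, not from the post):

weaken1 :: (a, b) |- a
weaken1 = Sub Dict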

Another property is transitivity. Given the implication a |- b, and b |- c, we can derive the statement a |- c.

First, we must introduce a helper. Given a term r which requires the context b, we may derive a term r requiring the context a instead, provided a |- b.

In other words, we may substitute the constraint under which a given term occurs:

(\\) :: a => (b => r) -> (a |- b) -> r
r \\ Sub Dict = r

Remember the definition of (|-): Sub wraps a value of type (a => Dict b). Pattern matching on Sub Dict, under the ambient constraint a, brings the constraint b into scope - so the term r, which requires b, type-checks. In other words, matching on the Dict 'peels off' the evidence for b, leaving only the requirement a.
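As a small usage example (ordEq and eqUnderOrd are illustrative names of mine): since Eq is a superclass of Ord, the entailment Ord a |- Eq a holds, and (\\) lets us use (==) given only an Ord constraint:

ordEq :: Ord a |- Eq a
ordEq = Sub Dict  -- the superclass dictionary provides Eq a

eqUnderOrd :: Ord a => a -> a -> Bool
eqUnderOrd x y = (x == y) \\ ordEq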

Now we can define the property of transitivity:

transitive :: (b |- c) -> (a |- b) -> (a |- c)
transitive f g = Sub (Dict \\ f \\ g)

We can view the transitive function as taking two pieces of evidence and returning a third. We use the helper (\\) to 'build' the final evidence piecewise: first, Dict \\ f discharges the requirement c down to b, and then \\ g discharges b down to a. What remains is a Dict c needing only the constraint a - exactly the evidence for a |- c.

But these definitions are the exact instances of Category for (|-), and they directly mirror the instances for (->) too:

id        :: a -> a
reflexive :: a |- a

(.)        :: (b -> c) -> (a -> b) -> (a -> c)
transitive :: (b |- c) -> (a |- b) -> (a |- c)

And therefore:

instance Category (|-) where
  id  = reflexive
  (.) = transitive

Note that the kind of entailment is: (|-) :: Constraint -> Constraint -> *

With a kind-polymorphic Category, the type (|-) can now be a valid instance where it couldn’t be before.
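And since entailments now form a Category, they compose with (.) like any other arrows. A sketch, reusing the illustrative weaken1 and ordEq from above (with Prelude's (.) hidden in favour of Control.Category's):

ordAndShowEq :: (Ord a, Show a) |- Eq a
ordAndShowEq = ordEq . weaken1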

By the way, the above definitions are all available as part of Edward’s constraints package.

Conclusion

This change was proposed a while ago, and could have been implemented as early as GHC 7.4 technically. But it’s a nice addition to top off what will already be an awesome 7.8 release.

Full code can be found here.

Sunday, February 10, 2013

Controlling inlining phases in GHC

Recently on StackOverflow there was a question about controlling inlining phases in GHC. I set out to answer that question, and I decided that it was interesting enough to post here. Below is my answer slightly modified to fit the context of this blog.

Introduction

You may have seen phase control and inlining annotations in libraries like vector, repa and dph before, but how do they work? It's nice to see a cut-down, concrete example of where using phase control in combination with RULES/INLINE is beneficial [1]. You rarely see these techniques outside of heavily optimized libraries, which are often complex, so case studies are great.

Here is an example I implemented recently, using recursion schemes. We will illustrate this using the concept of catamorphisms. You don’t need to know what those are in detail, just that they characterize ‘fold’ operators. (Really, do not focus too much on the abstract concepts here. This is just the simplest example I have, where you can have a nice speed-up.)

Quick intro to catamorphisms

We begin with Mu, the fix-point type, and a definition of Algebra which is just a fancy synonym for a function which “deconstructs” a value of f a to return an a.

newtype Mu f = Mu { muF :: f (Mu f) }

type Algebra f a = f a -> a

We may now define two operators, ffold and fbuild, which are highly-generic versions of the traditional foldr and build operators for lists:

ffold :: Functor f => Algebra f a -> Mu f -> a
ffold h = go h 
  where go g = g . fmap (go g) . muF
{-# INLINE ffold #-}

fbuild :: Functor f => (forall b. Algebra f b -> b) -> Mu f
fbuild g = g Mu
{-# INLINE fbuild #-}

Roughly speaking, ffold tears down a structure using an Algebra f a and yields an a. fbuild instead creates a structure, given a function that can turn any Algebra f b into a b, and yields a Mu value. That Mu value corresponds to whatever recursive data type you're talking about. This is just like regular foldr and build: we deconstruct a list by replacing its constructors, and we build a list by abstracting over its constructors. The idea is that we've just generalized these classic operators, so they can work over any recursive data type (like lists, or trees!)

Finally, there is a law that accompanies these two operators, which will guide our overall RULE:

forall f g. ffold f (fbuild g) = g f

This rule essentially generalizes the optimization of deforestation/fusion - the removal of the intermediate structure. (The proof of correctness of said law is left as an exercise to the reader; it follows from parametricity, i.e. the free theorem for fbuild's rank-2 argument.)
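To get a feel for these combinators before the list example, here's a tiny instantiation at Peano naturals (NatF, Nat, toInt and three are illustrative names of mine):

data NatF f = Zero | Succ f
  deriving Functor

type Nat = Mu NatF

toInt :: Nat -> Int
toInt = ffold g
  where g Zero     = 0
        g (Succ n) = n + 1

three :: Nat
three = fbuild (\h -> h (Succ (h (Succ (h (Succ (h Zero)))))))

Here toInt three evaluates to 3: fbuild instantiates h to the Mu constructor to build the structure, and ffold replaces each constructor with g to tear it down.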

We may now use these two combinators, along with Mu, to represent recursive data types like a list. And we can write operations over that list.

data ListF a f = Nil | Cons a f
  deriving (Eq, Show, Functor)
type List a = Mu (ListF a)

instance Eq a => Eq (List a) where
  (Mu f) == (Mu g) = f == g

lengthL :: List a -> Int
lengthL = ffold g
  where g Nil = 0
        g (Cons _ f) = 1 + f
{-# INLINE lengthL #-}

And we can define a map function as well:

mapL :: (a -> b) -> List a -> List b
mapL f = ffold g
  where g Nil = Mu Nil
        g (Cons a x) = Mu (Cons (f a) x)
{-# INLINE mapL #-}
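For playing with these in GHCi, it also helps to convert to and from ordinary lists. fromList and toList are illustrative helpers of my own; note fromList is written via fbuild, so it too is a candidate for fusion:

fromList :: [a] -> List a
fromList xs = fbuild (\h -> foldr (\a b -> h (Cons a b)) (h Nil) xs)

toList :: List a -> [a]
toList = ffold g
  where g Nil        = []
        g (Cons a x) = a : x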

Inlining FTW

We now have a means of writing terms over these recursive types we defined. However, if we were to write a term like

lengthL . mapL (+1) $ xs

Then if we expand the definitions, we essentially get the composition of two ffold operators:

ffold g1 . ffold g2 $ ...

And that means we’re actually destroying the structure, then rebuilding it and destroying again. That’s really wasteful. Also, we can re-define mapL in terms of fbuild, so it will hopefully fuse with other functions.

Well, we already have our law, so a RULE is in order. Let’s codify that:

{-# RULES
-- Builder rule for catamorphisms
"ffold/fbuild" forall f (g :: forall b. Algebra f b -> b).
                  ffold f (fbuild g) = g f
#-}

Next, we’ll redefine mapL in terms of fbuild for fusion purposes:

mapL2 :: (a -> b) -> List a -> List b
mapL2 f xs = fbuild (\h -> ffold (h . g) xs)
  where g Nil = Nil
        g (Cons a x) = Cons (f a) x
{-# INLINE mapL2 #-}

Aaaaaand we’re done, right? Wrong!

Phases for fun and profit

The problem is there are zero constraints on when inlining occurs, which will completely mess this up. Consider the case earlier that we wanted to optimize:

lengthL . mapL2 (+1) $ xs

We would like the definitions of lengthL and mapL2 to be inlined, so that the ffold/fbuild rule may fire afterwards, over the body. So we want to go to:

ffold f1 (fbuild g1) ...

via inlining, and after that go to:

g1 f1

via our RULE.

Well, that’s not guaranteed. Essentially, in one phase of the simplifier, GHC may not only inline the definitions of lengthL and mapL, but it may also inline the definitions of ffold and fbuild at their use sites. This means that the RULE will never get a chance to fire, as the phase ‘gobbled up’ all of the relevant identifiers, and inlined them into nothing.

The observation is that we would like to inline ffold and fbuild as late as possible: by delaying their inlining, we expose as many opportunities as possible for our RULE to fire. And if it doesn't fire, the bodies will get inlined anyway, and GHC will still give its best. But ultimately, we want the inlining to happen late; the RULE will save us more efficiency than any clever compiler optimization.

So the fix here is to annotate ffold and fbuild, specifying that they may only be inlined from phase 1 onwards:

ffold g = ...
{-# INLINE[1] ffold #-}

fbuild g = ...
{-# INLINE[1] fbuild #-}

Now, lengthL, mapL2 and friends will be inlined very early, but ffold and fbuild will be inlined very late. GHC's simplifier begins at some phase number N, and the phase numbers decrease towards zero, with phase 0 being the last; INLINE[1] means 'do not inline until phase 1'. It would also be possible to let fbuild/ffold inline sooner than phase 1, but then you would essentially need to increase the number of phases to make up for it, or make sure the RULE reliably fires in the earlier stages.

Conclusion

You can find all of this and more in a gist of mine [2], with all the mentioned definitions and examples here. It also comes with a criterion benchmark of our example: with our phase annotations, GHC is able to cut the runtime of lengthL . mapL2 in half compared to lengthL . mapL, when the RULE fires.

If you would like to see this yourself, you can compile the code with -ddump-simpl-stats and check that the ffold/fbuild rule fires during the compilation pipeline.
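The relevant part of the dump looks roughly like this (a trimmed, illustrative transcript; the file name and exact counts will differ):

$ ghc -O2 -fforce-recomp -ddump-simpl-stats Cata.hs
...
1 RuleFired
    1 ffold/fbuild
...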

Finally, most of the same principles apply to libraries like vector or bytestring. The difference is that those libraries may have multiple levels of inlining and many more rules. This is because techniques like stream/array fusion effectively fuse loops and reuse arrays, as opposed to the classical deforestation we did here, which merely removes an intermediate data structure. Depending on the typical 'pattern' of code generated (say, by a vectorized, parallel list comprehension), it may very much be worth interleaving or specifically phasing optimizations so that obvious deficiencies are eliminated early on, or optimizing for cases where a RULE in combination with an INLINE gives rise to further RULE opportunities (hence the staggered phases you sometimes see - this essentially interleaves a round of inlining). For these reasons, you can also control the phases in which a RULE fires.
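The syntax mirrors INLINE's phase annotations. A sketch with placeholder definitions (f and fFast are illustrative names): one rule rewrites forward only in the early phases, and another rewrites back late - the same back-and-forth trick GHC's own foldr/build fusion uses:

f, fFast :: Int -> Int
f     = (+ 1)
fFast = (+ 1)

{-# RULES
"f/forward" [~1] forall x. f x     = fFast x  -- active only before phase 1
"f/back"    [1]  forall x. fFast x = f x      -- active in phase 1 and later
#-}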

So, while RULEs with phases can save us a lot of runtime, they can take a lot of time to get right too. This is why you often see them only in the most ‘high performance’, heavily optimized libraries.

Footnotes


  1. The original question was "what kinds of functions benefit from phase control", which to me sounds like asking "which functions benefit from common subexpression elimination". I am not sure how to accurately answer this, if it's even possible! This is more of a compiler-realm thing than any theoretical result about how functions or programs behave - even with mathematical laws, not all 'optimizations' have the results you expect. As a result, the answer is effectively "you'll probably know when you write and benchmark it."

  2. You can safely ignore a lot of other stuff in the file; it was mostly a playground, but may be interesting to you too. There are other examples like naturals and binary trees in there; you may find it worthwhile to use them to try exploiting various other fusion opportunities.

Saturday, December 29, 2012

Phantom types for crypto keys

Phantom types and GADTs are well-known techniques in Haskell for building programs that enforce certain kinds of type refinement: essentially, they serve as witnesses to some type equality, and that equality enforces constraints on the nature of your code. In the case of GADTs, your witnesses are the data type constructors (which actually carry around an equality constraint), while with phantom types you typically build valid terms from an encapsulated data type, using provided APIs.

While working on salt, I recently found the types I used for the various keys to be a bit inexpressive. I want the library to be type safe in a way that avoids common misuse and helps the user write correct code: the API should be self-explanatory and simple.

My old API looked like this:

newtype PublicKey = PublicKey { unPublicKey :: ByteString } deriving ...
newtype SecretKey = SecretKey { unSecretKey :: ByteString } deriving ...

type KeyPair = (PublicKey, SecretKey)

And then you had things like the public-key encryption API, which looked like:

createKeypair :: IO KeyPair
encrypt :: Nonce       -- Nonce
        -> ByteString  -- Plaintext
        -> PublicKey   -- Receiver public key
        -> SecretKey   -- Sender secret key
        -> ByteString  -- Ciphertext

I thought for a little bit and eventually came up with a key API that is shared for all interfaces, very minimal and a bit more explanatory. I took the idea from repa: to tag the type constructor with type variables, and use them for distinguishing the kind of data.

So the new API looks like this:

newtype Key s i = Key { unKey :: ByteString }

data Public
data Secret

type KeyPair i = (Key Public i, Key Secret i)

And now encryption looks like this:

data PublicEncryptionKey
createKeypair :: IO (KeyPair PublicEncryptionKey)
encrypt :: Nonce
        -> ByteString
        -> Key Public PublicEncryptionKey
        -> Key Secret PublicEncryptionKey
        -> ByteString

The API is a bit wordier but, for the most part, code will either be parametric in the key type or the key type will be fixed by a particular API, so I imagine there's not much need to be so specific in practice.

I actually like this new API quite a bit because it’s easy to see at a glance what a key is for, and type-checking makes sure you won’t make simple mistakes. It’s also quite self-explanatory, I think.
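To sketch what this buys (SigningKey here is a hypothetical tag, not part of salt): code can stay parametric over a key's purpose when the purpose doesn't matter, while mixing purposes up is a compile-time error:

data SigningKey

-- Fine: parametric in both the secrecy tag s and the purpose tag i.
keyBytes :: Key s i -> ByteString
keyBytes = unKey

-- But passing a signing key to encrypt, e.g.
--   encrypt nonce msg pk (sk :: Key Secret SigningKey)
-- is rejected: Key Secret SigningKey doesn't match
-- Key Secret PublicEncryptionKey.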

I’m working on reducing the wordiness of imports and the APIs further, so hopefully this can be improved on.

Friday, July 22, 2011

Ur/Web at Hac φ

I'll be giving a short talk and presentation on Ur/Web at the next Haskell Hackathon, Hac φ. The talks are scheduled to be between 10-15 minutes, so I'm aiming to have a short spiel for the people who are there in person. But I'm also keeping the slides as complete and detailed as possible for those who are interested but won't be able to make it. I know slides aren't the best, but I don't think there will be any video recording of the talks, so I'll try to make them as self-contained as I can.

On that note, Pandoc is awesome.

If you're going to be there or want to talk, get in touch with me!

Edit: It's been done, and Hac Phi was a ton of fun! I didn't get the slides as complete as I wanted (partially due to laziness, gaming and a million other things), but for the interested, you can find the slides online here.

Tuesday, April 5, 2011

RC4 in Haskell

I've been meaning to try out Levent Erkok's sbv library, but for a while I couldn't think of anything to write. Now I have, and it was quite fun to implement: RC4 in pure Haskell. I took the original RC4 code from cypherpunks, and rewrote it using sbv.

It took a lot of tweaking to get my implementation right, but I think it works correctly now.

Everything, including the implementation, Haddock documentation, a few tests and a few theorems, weighs in at ~275 lines. Most of it I translated fairly directly from the cypherpunks posting. I also included a ByteString interface that matches the RC4 code in the cryptocipher package, and an instance of crypto-api's StreamCipher class, so you can actually use it for encrypting things. I make no guarantees of an efficient implementation; indeed, it's probably somewhat slow. If I can think of a good benchmark, I'll add some. But I enjoyed writing it, and sbv is a very nice DSL that was pretty easy to get used to.

The nice thing? This:

*RC4> proveRc4IsCorrect
** Elapsed problem construction time: 16.360s
** Elapsed translation time: 0.000s
** Elapsed Yices time: 34.084s
Q.E.D.
*RC4>


This showed my proposition decrypt (encrypt x) == x was correct for 40-bit keys and 40-bit plaintexts. I was quite glad that Yices could actually verify this. But despite the fact that my proposition holds according to the SMT solver, I also tested my implementation with QuickCheck against cryptocipher's RC4 module. Sure enough - my QuickCheck tests quickly turned up problems!
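For readers who haven't used sbv: a proposition like that is just an ordinary Haskell function over symbolic values, handed to prove. Here's a toy analogue with a one-byte XOR 'cipher' standing in for RC4 (illustrative only; the real property quantifies over the full 40-bit keys and plaintexts):

import Data.SBV
import Data.Bits (xor)

-- decrypt (encrypt x) == x, for a one-byte XOR "cipher"
xorRoundTrips :: SWord8 -> SWord8 -> SBool
xorRoundTrips k x = (x `xor` k) `xor` k .== x

main :: IO ()
main = print =<< prove xorRoundTrips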

Moral: don't trust code that's been proven but not tested, as Wouter said (PDF). Of course, in this instance, proving that encryption followed by decryption is the identity function isn't a huge breakthrough, nor is it surprising that my tests showed errors. Due to RC4's symmetric, pseudo-random nature, the keystream generation can be wrong and this proposition will still hold - and that is exactly where the bugs lay: keystream generation. Nonetheless, just stating a proposition and blasting it with the solver to be told it's correct is quite satisfying, and it did save me once or twice - I really like this SMT solver approach.

I haven't done the C code generation yet. It mainly has to do with the fact that the interface would be horrendous, because the generated C code cannot use arrays. Levent is working on a new version of sbv that will let you generate code using arrays, though (we need this for the S state of the RC4 context, which is a 256-byte array).

Anyway, here's the code. It also includes my version of the original C implementation from cypherpunks - mainly just a simple driver program with a test inside of it. Caveat emptor: I am not a cryptographer, issues may exist, etc.

I imagine with more compiler backends, it could be possible to have libraries of code written in Haskell using sbv that can then be verified, and extracted to other languages for use - it will also hopefully be extended to use other SMT solvers directly besides Yices. Maybe in future posts I'll write some tutorials for it or something.

Monday, March 14, 2011

QuickLZ compression for Haskell

I've finished off a binding to QuickLZ for Haskell, available here. It has a very simple interface and allows you to compress strict bytestrings very easily. It uses QuickLZ v1.5.0, compression level 1, and has support for overlapping decompression too.
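A minimal usage sketch (the module and function names here are my reading of the binding's strict-ByteString interface; treat them as assumptions):

import qualified Data.ByteString.Char8 as B
import Codec.Compression.QuickLZ (compress, decompress)

main :: IO ()
main = do
  let payload = B.concat (replicate 100 (B.pack "hello, quicklz! "))
      packed  = compress payload
  print (B.length packed, B.length payload)  -- compressed vs. original size
  print (decompress packed == payload)       -- should print True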

Currently there's no implementation for lazy bytestrings. I'm looking into enabling the streaming mode, so you can have a lazy interface, and also wiring up the result to John's enumerator package so you can write enumerator-based compression code. However, I'm currently hitting a strange segfault when defining a non-zero QLZ_STREAMING_BUFFER flag for the QuickLZ code, so I'll have to investigate that (and make sure it's not just me doing something dumb).

Comments welcome.