Data.Binary.Get
Copyright | Lennart Kolmodin |
---|---|
License | BSD3-style (see LICENSE) |
Maintainer | Lennart Kolmodin <[email protected]> |
Stability | experimental |
Portability | portable to Hugs and GHC. |
Safe Haskell | Trustworthy |
Language | Haskell98 |
Description
The Get
monad. A monad for efficiently building structures from encoded lazy ByteStrings.
Primitives are available to decode words of various sizes, both big and little endian.
Let's decode binary data representing illustrated here. In this example the values are in little endian.
+------------------+--------------+-----------------+ | 32 bit timestamp | 32 bit price | 16 bit quantity | +------------------+--------------+-----------------+
A corresponding Haskell value looks like this:
data Trade = Trade { timestamp :: !Word32 , price :: !Word32 , qty :: !Word16 } deriving (Show)
The fields in Trade
are marked as strict (using !
) since we don't need laziness here. In practise, you would probably consider using the UNPACK pragma as well. https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/glasgow_exts.html#unpack-pragma
Now, let's have a look at a decoder for this format.
getTrade :: Get Trade getTrade = do timestamp <- getWord32le price <- getWord32le quantity <- getWord16le return $! Trade timestamp price quantity
Or even simpler using applicative style:
getTrade' :: Get Trade getTrade' = Trade <$> getWord32le <*> getWord32le <*> getWord16le
There are two kinds of ways to execute this decoder, the lazy input method and the incremental input method. Here we will use the lazy input method.
Let's first define a function that decodes many Trade
s.
getTrades :: Get [Trade] getTrades = do empty <- isEmpty if empty then return [] else do trade <- getTrade trades <- getTrades return (trade:trades)
Finally, we run the decoder:
lazyIOExample :: IO [Trade] lazyIOExample = do input <- BL.readFile "trades.bin" return (runGet getTrades input)
This decoder has the downside that it will need to read all the input before it can return. On the other hand, it will not return anything until it knows it could decode without any decoder errors.
You could also refactor to a left-fold, to decode in a more streaming fashion, and get the following decoder. It will start to return data without knowing that it can decode all input.
incrementalExample :: BL.ByteString -> [Trade] incrementalExample input0 = go decoder input0 where decoder = runGetIncremental getTrade go :: Decoder Trade -> BL.ByteString -> [Trade] go (Done leftover _consumed trade) input = trade : go decoder (BL.chunk leftover input) go (Partial k) input = go (k . takeHeadChunk $ input) (dropHeadChunk input) go (Fail _leftover _consumed msg) _input = error msg takeHeadChunk :: BL.ByteString -> Maybe BS.ByteString takeHeadChunk lbs = case lbs of (BL.Chunk bs _) -> Just bs _ -> Nothing dropHeadChunk :: BL.ByteString -> BL.ByteString dropHeadChunk lbs = case lbs of (BL.Chunk _ lbs') -> lbs' _ -> BL.Empty
The lazyIOExample
uses lazy I/O to read the file from the disk, which is not suitable in all applications, and certainly not if you need to read from a socket which has higher likelihood to fail. To address these needs, use the incremental input method like in incrementalExample
. For an example of how to read incrementally from a Handle, see the implementation of decodeFileOrFail
.
The Get monad
Instances
Monad Get | |
Functor Get | |
MonadFail Get | |
Defined in Data.Binary.Get.Internal | |
Applicative Get | |
Alternative Get | Since: binary-0.7.0.0 |
MonadPlus Get | Since: binary-0.7.1.0 |
The lazy input interface
The lazy interface consumes a single lazy ByteString
. It's the easiest interface to get started with, but it doesn't support interleaving I/O and parsing, unless lazy I/O is used.
There is no way to provide more input other than the initial data. To be able to incrementally give more data, see the incremental input interface.
runGet :: Get a -> ByteString -> a Source
The simplest interface to run a Get
decoder. If the decoder runs into an error, calls fail
, or runs out of input, it will call error
.
runGetOrFail :: Get a -> ByteString -> Either (ByteString, ByteOffset, String) (ByteString, ByteOffset, a) Source
Run a Get
monad and return Left
on failure and Right
on success. In both cases any unconsumed input and the number of bytes consumed is returned. In the case of failure, a human-readable error message is included as well.
Since: binary-0.6.4.0
type ByteOffset = Int64 Source
An offset, counted in bytes.
The incremental input interface
The incremental interface gives you more control over how input is provided during parsing. This lets you e.g. interleave parsing and I/O.
The incremental interface consumes a strict ByteString
at a time, each being part of the total amount of input. If your decoder needs more input to finish it will return a Partial
with a continuation. If there is no more input, provide it Nothing
.
Fail
will be returned if it runs into an error, together with a message, the position and the remaining input. If it succeeds it will return Done
with the resulting value, the position and the remaining input.
A decoder procuced by running a Get
monad.
Constructors
Fail !ByteString !ByteOffset String | The decoder ran into an error. The decoder either used |
Partial (Maybe ByteString -> Decoder a) | The decoder has consumed the available input and needs more to continue. Provide |
Done !ByteString !ByteOffset a | The decoder has successfully finished. Except for the output value you also get any unused input as well as the number of bytes consumed. |
runGetIncremental :: Get a -> Decoder a Source
Run a Get
monad. See Decoder
for what to do next, like providing input, handling decoder errors and to get the output value. Hint: Use the helper functions pushChunk
, pushChunks
and pushEndOfInput
.
Providing input
pushChunk :: Decoder a -> ByteString -> Decoder a Source
Feed a Decoder
with more input. If the Decoder
is Done
or Fail
it will add the input to ByteString
of unconsumed input.
runGetIncremental myParser `pushChunk` myInput1 `pushChunk` myInput2
pushChunks :: Decoder a -> ByteString -> Decoder a Source
Feed a Decoder
with more input. If the Decoder
is Done
or Fail
it will add the input to ByteString
of unconsumed input.
runGetIncremental myParser `pushChunks` myLazyByteString
pushEndOfInput :: Decoder a -> Decoder a Source
Tell a Decoder
that there is no more input. This passes Nothing
to a Partial
decoder, otherwise returns the decoder unchanged.
Decoding
Skip ahead n
bytes. Fails if fewer than n
bytes are available.
Test whether all input has been consumed, i.e. there are no remaining undecoded bytes.
Get the total number of bytes read to this point.
Isolate a decoder to operate with a fixed number of bytes, and fail if fewer bytes were consumed, or more bytes were attempted to be consumed. If the given decoder fails, isolate
will also fail. Offset from bytesRead
will be relative to the start of isolate
, not the absolute of the input.
Since: binary-0.7.2.0
lookAhead :: Get a -> Get a Source
Run the given decoder, but without consuming its input. If the given decoder fails, then so will this function.
Since: binary-0.7.0.0
lookAheadM :: Get (Maybe a) -> Get (Maybe a) Source
Run the given decoder, and only consume its input if it returns Just
. If Nothing
is returned, the input will be unconsumed. If the given decoder fails, then so will this function.
Since: binary-0.7.0.0
lookAheadE :: Get (Either a b) -> Get (Either a b) Source
Run the given decoder, and only consume its input if it returns Right
. If Left
is returned, the input will be unconsumed. If the given decoder fails, then so will this function.
Since: binary-0.7.1.0
label :: String -> Get a -> Get a Source
Label a decoder. If the decoder fails, the label will be appended on a new line to the error message string.
Since: binary-0.7.2.0
ByteStrings
getByteString :: Int -> Get ByteString Source
An efficient get method for strict ByteStrings. Fails if fewer than n
bytes are left in the input. If n <= 0
then the empty string is returned.
getLazyByteString :: Int64 -> Get ByteString Source
An efficient get method for lazy ByteStrings. Fails if fewer than n
bytes are left in the input.
getLazyByteStringNul :: Get ByteString Source
Get a lazy ByteString that is terminated with a NUL byte. The returned string does not contain the NUL byte. Fails if it reaches the end of input without finding a NUL.
getRemainingLazyByteString :: Get ByteString Source
Get the remaining bytes as a lazy ByteString. Note that this can be an expensive function to use as it forces reading all input and keeping the string in-memory.
Decoding Words
Read a Word8 from the monad state
Big-endian decoding
getWord16be :: Get Word16 Source
Read a Word16 in big endian format
getWord32be :: Get Word32 Source
Read a Word32 in big endian format
getWord64be :: Get Word64 Source
Read a Word64 in big endian format
Little-endian decoding
getWord16le :: Get Word16 Source
Read a Word16 in little endian format
getWord32le :: Get Word32 Source
Read a Word32 in little endian format
getWord64le :: Get Word64 Source
Read a Word64 in little endian format
Host-endian, unaligned decoding
getWordhost :: Get Word Source
O(1). Read a single native machine word. The word is read in host order, host endian form, for the machine you're on. On a 64 bit machine the Word is an 8 byte value, on a 32 bit machine, 4 bytes.
getWord16host :: Get Word16 Source
O(1). Read a 2 byte Word16 in native host order and host endianness.
getWord32host :: Get Word32 Source
O(1). Read a Word32 in native host order and host endianness.
getWord64host :: Get Word64 Source
O(1). Read a Word64 in native host order and host endianess.
Decoding Ints
Read an Int8 from the monad state
Big-endian decoding
getInt16be :: Get Int16 Source
Read an Int16 in big endian format.
getInt32be :: Get Int32 Source
Read an Int32 in big endian format.
getInt64be :: Get Int64 Source
Read an Int64 in big endian format.
Little-endian decoding
getInt16le :: Get Int16 Source
Read an Int16 in little endian format.
getInt32le :: Get Int32 Source
Read an Int32 in little endian format.
getInt64le :: Get Int64 Source
Read an Int64 in little endian format.
Host-endian, unaligned decoding
getInthost :: Get Int Source
O(1). Read a single native machine word in native host order. It works in the same way as getWordhost
.
getInt16host :: Get Int16 Source
O(1). Read a 2 byte Int16 in native host order and host endianness.
getInt32host :: Get Int32 Source
O(1). Read an Int32 in native host order and host endianness.
getInt64host :: Get Int64 Source
O(1). Read an Int64 in native host order and host endianess.
Decoding Floats/Doubles
getFloatbe :: Get Float Source
Read a Float
in big endian IEEE-754 format.
getFloatle :: Get Float Source
Read a Float
in little endian IEEE-754 format.
getFloathost :: Get Float Source
Read a Float
in IEEE-754 format and host endian.
getDoublebe :: Get Double Source
Read a Double
in big endian IEEE-754 format.
getDoublele :: Get Double Source
Read a Double
in little endian IEEE-754 format.
getDoublehost :: Get Double Source
Read a Double
in IEEE-754 format and host endian.
Deprecated functions
runGetState :: Get a -> ByteString -> ByteOffset -> (a, ByteString, ByteOffset) Source
Deprecated: Use runGetIncremental instead. This function will be removed.
DEPRECATED. Provides compatibility with previous versions of this library. Run a Get
monad and return a tuple with three values. The first value is the result of the decoder. The second and third are the unused input, and the number of consumed bytes.
Deprecated: This will force all remaining input, don't use it.
DEPRECATED. Get the number of bytes of remaining input. Note that this is an expensive function to use as in order to calculate how much input remains, all input has to be read and kept in-memory. The decoder keeps the input as a strict bytestring, so you are likely better off by calculating the remaining input in another way.
getBytes :: Int -> Get ByteString Source
Deprecated: Use getByteString
instead of getBytes
.
DEPRECATED. Same as getByteString
.
© The University of Glasgow and others
Licensed under a BSD-style license (see top of the page).
https://downloads.haskell.org/~ghc/8.10.2/docs/html/libraries/binary-0.8.8.0/Data-Binary-Get.html