Format strings in OCaml


OCaml doesn’t have string interpolation, but it does have C-style format strings, and they are type-safe. Here’s an example:

let hello name = Printf.printf "Hello, %s!\n" name
(* Can be written as: let hello = Printf.printf "Hello, %s!\n" *)

This is type-safe in an almost magical way (example REPL session):

# hello 1;;
Error: This expression has type int but an expression was expected of type
         string
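
To make this a little more concrete, here is a small sketch showing how each conversion specification fixes the type of one argument. The summary function and its parameter names are made up for illustration; it uses Printf.sprintf rather than Printf.printf so that the result is a plain string:

```ocaml
(* Each conversion specification fixes the type of one argument:
   %s expects a string, %d an int, %.2f a float, %b a bool.
   `summary` is a hypothetical helper, not part of any library. *)
let summary name count price in_stock =
  Printf.sprintf "%s: %d units at %.2f each (in stock: %b)"
    name count price in_stock

let () = print_endline (summary "apples" 3 0.5 true)
```

The compiler infers string -> int -> float -> bool -> string for summary from the format string alone; swapping any two arguments of different types is rejected at compile time.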

It can however be a little tricky to wrap your head around:

# let bob = "Bob";;
val bob : string = "Bob"

# Printf.printf bob;;
Error: This expression has type string but an expression was expected of type
         ('a, out_channel, unit) format =
           ('a, out_channel, unit, unit, unit, unit) format6

This error is saying that the printf function wants a ‘format string’, which is distinct from a regular string:

# let bob = format_of_string "bob";;
val bob : ('_weak1, '_weak2, '_weak3, '_weak4, '_weak4, '_weak1) format6 =
  CamlinternalFormatBasics.Format
   (CamlinternalFormatBasics.String_literal ("bob",
     CamlinternalFormatBasics.End_of_format),
   "bob")

# Printf.printf bob;;
bob- : unit = ()
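
Besides format_of_string, a type annotation is another way to tell the compiler that a literal should be read as a format string. The following is only a sketch: greeting and greet are hypothetical names, and the annotation is written for Printf.sprintf, which expects a value of type ('a, unit, string) format:

```ocaml
(* The annotation makes the compiler parse the literal as a format string.
   Here 'a is string -> string because of the single %s conversion. *)
let greeting : (string -> string, unit, string) format = "Hello, %s!"

(* `greet` is a hypothetical helper that reuses the named format string. *)
let greet name = Printf.sprintf greeting name

let () = print_endline (greet "Bob")
```

Because the annotation spells out the full type, there are no weak type variables left over, unlike in the format_of_string session above.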

OCaml distinguishes between regular strings and format strings. The latter are complex structures which encode type information inside them. They are parsed and turned into these structures either when the compiler sees a string literal and ‘realizes’ that a format string is expected, or when you (the programmer) explicitly ask for the conversion. Another example:

# let fmt = "Hello, %s!\n" ^^ "";;
val fmt :
  (string -> '_weak5, '_weak6, '_weak7, '_weak8, '_weak8, '_weak5) format6 =
  CamlinternalFormatBasics.Format
   (CamlinternalFormatBasics.String_literal ("Hello, ",
     CamlinternalFormatBasics.String (CamlinternalFormatBasics.No_padding,
      CamlinternalFormatBasics.String_literal ("!\n",
       CamlinternalFormatBasics.End_of_format))),
   "Hello, %s!\n%,")

# Printf.printf fmt "Bob";;
Hello, Bob!
- : unit = ()

The ^^ operator is the format string concatenation operator. Think of it as a more powerful version of the string concatenation operator, ^. It can concatenate either format strings that have already been bound to a name, or string literals which it interprets as format strings:

# bob ^^ bob;;
- : (unit, out_channel, unit, unit, unit, unit) format6 =
CamlinternalFormatBasics.Format
 (CamlinternalFormatBasics.String_literal ("bob",
   CamlinternalFormatBasics.String_literal ("bob",
    CamlinternalFormatBasics.End_of_format)),
 "bob%,bob")

# bob ^^ "!";;
- : (unit, out_channel, unit, unit, unit, unit) format6 =
CamlinternalFormatBasics.Format
 (CamlinternalFormatBasics.String_literal ("bob",
   CamlinternalFormatBasics.Char_literal ('!',
    CamlinternalFormatBasics.End_of_format)),
 "bob%,!")

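The examples above only concatenate formats without conversions, but ^^ also merges the arguments that each side expects. As a rough sketch (purchase is a hypothetical function name), two literals with one conversion each combine into a format that takes two arguments:

```ocaml
(* The left format expects a string, the right one an int;
   the concatenated format expects both, in that order. *)
let purchase name n =
  Printf.sprintf ("%s bought " ^^ "%d apples today") name n

let () = print_endline (purchase "Bob" 3)
```

The concatenation is written inline inside a function here so that the format's type is fully determined by Printf.sprintf and no weak type variables are left at the top level.
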
Custom formatting functions

The really amazing thing about format strings is that you can define your own functions which use them to output formatted text. For example:

# let shout fmt = Printf.ksprintf (fun s -> s ^ "!") fmt;;
val shout : ('a, unit, string, string) format4 -> 'a = <fun>

# shout "hello";;
- : string = "hello!"

# let jim = "Jim";;
val jim : string = "Jim"

# shout "Hello, %s" jim;;
- : string = "Hello, Jim!"

This is really just a simple example; you are not restricted to returning only strings from ksprintf. You can return any data structure you like. Think of ksprintf as ‘(k)ontinuation-based sprintf’; in other words, it takes a format string (fmt) and any arguments needed by the format string (e.g. jim), builds the output string, then passes it to the continuation that you provide (fun s -> ...), in which you can build any value you want. This value will be the final result of the function call.
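
As a sketch of a continuation that returns something other than a string, here is a hypothetical errorf helper (not part of the standard library) which wraps the formatted message in the Error constructor of the result type. The check_age function is likewise invented just to show it in use:

```ocaml
(* The continuation receives the formatted string and wraps it in Error,
   so the final value is a result rather than a string. *)
let errorf fmt = Printf.ksprintf (fun s -> Error s) fmt

(* Hypothetical validation function using errorf. *)
let check_age age =
  if age < 0 then errorf "invalid age: %d" age
  else Ok age

let () =
  match check_age (-1) with
  | Ok age -> Printf.printf "age is %d\n" age
  | Error msg -> print_endline msg
```

The format string is still checked at compile time, so errorf "invalid age: %d" "x" would be rejected just like the printf examples earlier.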

Again, this is just as type-safe as the basic printf function:

# shout "Hello, jim" jim;;
Error: This expression has type
         ('a -> 'b, unit, string, string, string, 'a -> 'b)
         CamlinternalFormatBasics.fmt
       but an expression was expected of type
         ('a -> 'b, unit, string, string, string, string)
         CamlinternalFormatBasics.fmt
       Type 'a -> 'b is not compatible with type string

This error message looks a bit scary, but the real clue is in the last line. The format string "Hello, jim" contains no conversion specifications, so shout should return a string straight away; passing the extra argument jim instead requires the result to be a function ('a -> 'b), and the two types are not compatible. Unfortunately the type error here is not that great because of how powerful and general this function is. Because it could potentially accept any number of arguments depending on the format string, its type is expressed in a very general way. This is a drawback of format strings to watch out for. But once you are familiar with it, it’s typically not a big problem. You just need to match up the conversion specifications, such as %s, with the actual arguments passed in after the format string.

You might have noticed that the function is defined with let shout fmt = .... It doesn’t look like it could accept ‘any number of arguments’. The trick here is that in OCaml, every function accepts only a single argument and returns either a final non-function value, or a new function. For functions which use format strings, which of the two you get depends on the conversion specifications in the format string, so the formal definition shout fmt could potentially turn into a call like shout "%s bought %d apples today" bob num_apples. As a shortcut, you can think of the format string fmt as a variadic argument which can potentially turn into any number of arguments at the call site.
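
One way to see this is to apply shout to the format string only and look at what comes back. In the sketch below, shout is repeated from above so the snippet stands on its own; plain, greet and report are hypothetical names, and the type annotations are there purely to make the resulting types visible:

```ocaml
let shout fmt = Printf.ksprintf (fun s -> s ^ "!") fmt

(* No conversions: the result is already a string. *)
let plain : string = shout "hello"

(* One conversion: the result is a function waiting for a string. *)
let greet : string -> string = shout "Hello, %s"

(* Two conversions: the result is a function of two arguments. *)
let report : string -> int -> string = shout "%s bought %d apples today"

let () =
  print_endline plain;
  print_endline (greet "Jim");
  print_endline (report "Bob" 3)
```

In each case shout itself received exactly one argument, the format string; the ‘extra’ arguments are consumed by the function it returns.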

More reading

You can read more about OCaml’s format strings functionality in the documentation for the Printf and Format modules. There is also a gentle guide to formatting text, something OCaml has fairly advanced support for because it turns out to be a pretty common requirement to print out the values of various things at runtime.

On that note, I have also written more about defining custom formatted printers for any value right here on dev.to. Enjoy 🐫
