Virtually all .NET code on AWS Lambda has to deal with JSON serialization. Historically, Newtonsoft Json.NET has been the go-to library. More recently, System.Text.Json was introduced in .NET Core 3. Both libraries use reflection at runtime to build their serialization logic. The newest technique, source generators, was introduced in .NET 6 and uses a compile-time approach that avoids reflection altogether.
So, now we have three approaches to choose from, which raises the question: is there a clear winner, or is it more nuanced?
For these benchmarks, the code deserializes a fairly bloated JSON data structure taken from the GitHub API documentation and then returns an empty response.
Newtonsoft Json.NET
This library has been around for so long and has been so popular that it broke the download counter when it exceeded 2 billion downloads on nuget.org. The counter has since been fixed, but this impressive milestone remains!
```csharp
using System.IO;
using System.Threading.Tasks;
using Amazon.Lambda.Core;
using Amazon.Lambda.Serialization.Json;

[assembly: LambdaSerializer(typeof(JsonSerializer))]

namespace Benchmark.NewtonsoftJson {

    public sealed class Function {

        //--- Methods ---
        public async Task<Stream> ProcessAsync(Root request) {
            return Stream.Null;
        }
    }
}
```
Minimum Cold Start Duration
The 4 fastest cold start durations all use the x86-64 architecture with ReadyToRun enabled. The fastest also enables Tiered Compilation. Enabling the PreJIT option always increases the cold start duration, but those configurations still make the top 4 cut.
Architecture | Memory Size | Tiered | ReadyToRun | PreJIT | Init (ms) | Cold Used (ms) | Total Cold Start (ms) |
---|---|---|---|---|---|---|---|
x86_64 | 1769MB | no | yes | no | 262.942 | 186.097 | 449.039 |
x86_64 | 1769MB | no | yes | yes | 317.328 | 151.456 | 468.784 |
x86_64 | 1769MB | yes | yes | no | 236.714 | 170.028 | 406.742 |
x86_64 | 1769MB | yes | yes | yes | 295.209 | 137.727 | 432.936 |
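For context, here is roughly how the three options toggled by these benchmarks are configured. This is a sketch: `PublishReadyToRun` and `TieredCompilation` are standard .NET publish-time MSBuild properties, while PreJIT corresponds, to the best of my knowledge, to the .NET Lambda runtime's `AWS_LAMBDA_DOTNET_PREJIT` environment variable.

```xml
<!-- .csproj publish settings (sketch) -->
<PropertyGroup>
  <!-- ReadyToRun: precompile IL to native code at publish time to cut JIT work at cold start -->
  <PublishReadyToRun>true</PublishReadyToRun>
  <!-- Tiered Compilation: quick Tier0 JIT first, with optimized Tier1 re-JIT of hot methods -->
  <TieredCompilation>false</TieredCompilation>
</PropertyGroup>
```

PreJIT is then toggled per function, e.g. by setting `AWS_LAMBDA_DOTNET_PREJIT=Always` in the function's environment variables.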
Minimum Execution Cost
I’ll admit, I was a bit surprised here. I would have expected ARM64 to be the obvious choice, since its per-millisecond execution cost is 20% lower. However, that was not the case. Instead, we have a 50/50 split, with x86-64 winning ever so slightly.
Also interesting is that the cheapest execution cost always uses the PreJIT option. That makes intuitive sense, since this option shifts some cost from the first INVOKE phase to the free INIT phase and only incurs a small overhead penalty otherwise.
Similarly, Tiered Compilation is disabled for all because it introduces additional overhead during the warm INVOKE phases.
Most fascinating to me is that ARM64 is cheaper with 512 MB memory, while x86-64 is cheaper with 256 MB. This is probably just an oddity, but it serves to highlight that nothing is ever obvious and why benchmarking the actual code is so important!
Architecture | Memory Size | Tiered | ReadyToRun | PreJIT | Init (ms) | Cold Used (ms) | Total Warm Used (ms, 100 invokes) | Cost (µ$) |
---|---|---|---|---|---|---|---|---|
arm64 | 256MB | no | yes | yes | 346.884 | 1598.711 | 406.117 | 26.88279408 |
arm64 | 512MB | no | yes | yes | 348.615 | 753.974 | 238.541 | 26.81680042 |
x86_64 | 256MB | no | yes | yes | 317.574 | 1186.12 | 377.718 | 26.71600553 |
x86_64 | 512MB | no | yes | yes | 314.298 | 562.768 | 234.544 | 26.84427746 |
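As a sanity check on the Cost column: assuming Lambda's list pricing at the time (0.2 µ$ per request and 16.6667 µ$ per GB-second on x86-64; these rates are my assumption, not stated by the benchmark), the cheapest x86-64 row above works out as:

```text
Requests:  101 invokes × 0.2 µ$                     = 20.2000 µ$
Duration:  (1186.120 ms cold + 377.718 ms warm)
           = 1.563838 s × 0.25 GB = 0.390960 GB-s
           × 16.6667 µ$/GB-s                        ≈  6.5160 µ$
Total:                                              ≈ 26.7160 µ$
```

which matches the reported 26.71600553 µ$. Note that the INIT phase is free on the managed runtime, so the 317.574 ms of Init time never appears in the bill.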
System.Text.Json – Reflection
System.Text.Json was introduced in .NET Core 3. The initial release was not feature-rich enough to be a compelling choice. However, that is no longer the case. By .NET 5, all my concerns were addressed, and it has been my preferred choice since. Sadly, we had to wait until .NET 6, which is LTS, for it to become supported on AWS Lambda.
```csharp
using System.IO;
using System.Threading.Tasks;
using Amazon.Lambda.Core;
using Amazon.Lambda.Serialization.SystemTextJson;

[assembly: LambdaSerializer(typeof(DefaultLambdaJsonSerializer))]

namespace Benchmark.SystemTextJson {

    public sealed class Function {

        //--- Methods ---
        public async Task<Stream> ProcessAsync(Root request) {
            return Stream.Null;
        }
    }
}
```
Minimum Cold Start Duration
Similar to Json.NET, the 4 fastest cold start durations use the x86-64 architecture. Unlike the previous benchmark, all of them have Tiered Compilation enabled. ReadyToRun provides only a slight benefit, likely because the JSON serialization code lives in the .NET runtime libraries, which already ship precompiled. As before, PreJIT makes things slower, but it’s still among the 4 fastest configurations.
Architecture | Memory Size | Tiered | ReadyToRun | PreJIT | Init (ms) | Cold Used (ms) | Total Cold Start (ms) |
---|---|---|---|---|---|---|---|
x86_64 | 1769MB | yes | no | no | 231.55 | 97.37 | 328.92 |
x86_64 | 1769MB | yes | no | yes | 276.791 | 74.063 | 350.854 |
x86_64 | 1769MB | yes | yes | no | 226.864 | 93.64 | 320.504 |
x86_64 | 1769MB | yes | yes | yes | 273.615 | 71.244 | 344.859 |
Minimum Execution Cost
Identical to the Json.NET benchmark, the 4 cheapest execution costs disable Tiered Compilation and enable the PreJIT option. Also, results are evenly split between ARM64 and x86-64.
Again, the optimal configuration uses the x86-64 architecture with ReadyToRun enabled. However, this time, all 4 optimal configurations agree on 256 MB for memory.
Architecture | Memory Size | Tiered | ReadyToRun | PreJIT | Init (ms) | Cold Used (ms) | Total Warm Used (ms, 100 invokes) | Cost (µ$) |
---|---|---|---|---|---|---|---|---|
arm64 | 256MB | no | no | yes | 335.019 | 977.84 | 344.601 | 24.60815771 |
arm64 | 256MB | no | yes | yes | 330.424 | 966.123 | 347.232 | 24.57787356 |
x86_64 | 256MB | no | no | yes | 302.287 | 688.363 | 341.735 | 24.49208483 |
x86_64 | 256MB | no | yes | yes | 293.871 | 679.57 | 299.889 | 24.28108858 |
System.Text.Json – Source Generator
New in .NET 6 is the ability to generate the JSON serialization code during compilation instead of relying on reflection at runtime.
Personally, as someone who cares a lot about performance, I find source generators a really exciting addition to our developer toolbox. However, I don’t consider this iteration to be production ready, because it is missing some features I rely on. In particular, the lack of custom type converters to override the default JSON serialization behavior is a blocker for me. That said, for some smaller projects, it might be viable. My biggest recommendation here is to thoroughly validate the output to ensure any behavior changes are caught during development.
```csharp
using System.IO;
using System.Text.Json.Serialization;
using System.Threading.Tasks;
using Amazon.Lambda.Core;
using Amazon.Lambda.Serialization.SystemTextJson;
using Benchmark.SourceGeneratorJson;

[assembly: LambdaSerializer(typeof(SourceGeneratorLambdaJsonSerializer<FunctionSerializerContext>))]

namespace Benchmark.SourceGeneratorJson;

[JsonSerializable(typeof(Root))]
public partial class FunctionSerializerContext : JsonSerializerContext { }

public sealed class Function {

    //--- Methods ---
    public async Task<Stream> ProcessAsync(Root request) {
        return Stream.Null;
    }
}
```
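Outside of the Lambda host, the same generated context can be exercised directly with System.Text.Json, which is a handy way to do the output validation recommended above. This is a minimal sketch: `sampleJson` is a hypothetical stand-in for the GitHub payload, and the `JsonTypeInfo`-based overloads shown here exist as of .NET 6.

```csharp
using System.Text.Json;

// Hypothetical input; the benchmark's real payload comes from the GitHub API docs.
const string sampleJson = "{}";

// Deserialize through the generated metadata (FunctionSerializerContext above);
// no reflection is involved.
Root? request = JsonSerializer.Deserialize(sampleJson, FunctionSerializerContext.Default.Root);

// Serializing the same way makes it easy to diff against the reflection-based
// output and catch behavior changes during development.
string roundTripped = JsonSerializer.Serialize(request!, FunctionSerializerContext.Default.Root);
```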
Minimum Cold Start Duration
This time, the 4 fastest cold starts all use Tiered Compilation and ReadyToRun. Since source generators produce more code to JIT, it makes sense that options designed to reduce startup JIT work improve cold start performance here. Also, unlike the previous benchmarks, ARM64 and x86-64 now compete for the top spot. PreJIT again slows things down a bit, but still makes it into the top 4.
Despite ARM64 finally making an appearance in the Minimum Cold Start Duration benchmark, the x86-64 architecture still secures the top two spots.
Architecture | Memory Size | Tiered | ReadyToRun | PreJIT | Init (ms) | Cold Used (ms) | Total Cold Start (ms) |
---|---|---|---|---|---|---|---|
arm64 | 1769MB | yes | yes | no | 249.244 | 65.429 | 314.673 |
arm64 | 1769MB | yes | yes | yes | 276.097 | 60.221 | 336.318 |
x86_64 | 1769MB | yes | yes | no | 240.88 | 53.104 | 293.984 |
x86_64 | 1769MB | yes | yes | yes | 265.776 | 46.327 | 312.103 |
Minimum Execution Cost
The results for this benchmark are a bit harder to parse. For the first time, there is no symmetry across options. Instead, ARM64 secures 3 of the 4 cheapest spots, and the same is true of the PreJIT option and the 256 MB memory configuration.
Similar to the Json.NET benchmark, the cheapest configurations use ReadyToRun and, as for all execution cost benchmarks, Tiered Compilation is disabled.
Architecture | Memory Size | Tiered | ReadyToRun | PreJIT | Init (ms) | Cold Used (ms) | Total Warm Used (ms, 100 invokes) | Cost (µ$) |
---|---|---|---|---|---|---|---|---|
arm64 | 256MB | no | yes | no | 287.093 | 702.015 | 294.423 | 23.52147561 |
arm64 | 256MB | no | yes | yes | 311.507 | 660.822 | 295.178 | 23.38668193 |
arm64 | 512MB | no | yes | yes | 312.017 | 315.322 | 204.109 | 23.66288998 |
x86_64 | 256MB | no | yes | yes | 294.279 | 519.965 | 298.581 | 23.61061349 |
Summary
Here are our observed lower bounds for the JSON serialization libraries, as well as the baseline performance on .NET 6 for comparison. I’ve omitted .NET Core 3.1, since I don’t consider it a viable target runtime anymore. However, you can explore the full result set in the interactive Google spreadsheet.
- Baseline for .NET 6
  - Cold start duration: 223 ms
  - Execution cost: 21.94 µ$
- Newtonsoft Json.NET
  - Cold start duration: 433 ms
  - Execution cost: 26.72 µ$
- System.Text.Json – Reflection
  - Cold start duration: 321 ms
  - Execution cost: 24.28 µ$
- System.Text.Json – Source Generator
  - Cold start duration: 294 ms
  - Execution cost: 23.39 µ$
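The overhead and savings figures quoted in the discussion follow directly from these numbers:

```text
Json.NET:           433 − 223 = 210 ms cold start overhead
STJ (reflection):   321 − 223 =  98 ms (≈100 ms);  24.28 / 26.72 ≈ 0.91  → ~9% cheaper than Json.NET
STJ (source gen):   294 − 223 =  71 ms (≈70 ms);   23.39 / 26.72 ≈ 0.875 → ~12% cheaper than Json.NET
```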
It shouldn’t be a surprise that Json.NET, which has been around for a long time, has accumulated a lot of cruft. Json.NET is truly a Swiss Army knife for serialization, and this flexibility comes at a cost: it adds at least 210 ms to our cold start duration, and it’s also the most expensive JSON library to run.
The newer System.Text.Json library offers a compelling performance and cost benefit over Json.NET: it adds only about 100 ms to our cold start duration and is 9% cheaper to run.
However, the clear winner is the new JSON source generator, with only about 70 ms of cold start overhead compared to our baseline and a cost 12% lower than Json.NET’s. That said, its missing features may mean it’s not a good choice just yet.
When it comes to minimizing cold start duration, the more memory, the better. These benchmarks used 1,769 MB, which unlocks most of the available vCPU performance, but not all of it. Full vCPU performance is achieved at 3,008 MB, which almost doubles the cost for a 10% improvement (source).
For minimizing cost, 256 MB seems to be the preferred choice. Tiered Compilation should always be disabled, while ReadyToRun is beneficial. The odd thing about this configuration is that ReadyToRun produces Tier0-quality code (i.e., quickly jitted code without inlining, hoisting, or any of that delicious performance stuff). With Tiered Compilation disabled, that code will never be optimized further, as far as I know.
What’s Next
For the next post, I’m going to investigate the overhead introduced by the AWS SDK. Since most Lambda functions will use it, I thought it would be useful to understand what the initialization cost is.