Manuel D. Let us take a look at the limits. They are a bit arbitrary, seeing where we hit them. I think the 4M chars we estimate is about 1M tokens. Can you estimate how large your web page is, in chars or tokens?
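For a quick size estimate, here is a minimal sketch. It assumes roughly 4 chars per token, which is only the ratio implied by "4M chars is about 1M tokens" above; the real ratio depends on the tokenizer and the content. The names `estimate_size`, `CHARS_PER_TOKEN` and `CHAR_LIMIT` are hypothetical.

```python
# Rough size check for a page payload.
# Assumption: ~4 chars per token, taken from the "4M chars ~ 1M tokens" estimate.
CHARS_PER_TOKEN = 4
CHAR_LIMIT = 4_000_000  # the 4M char limit discussed in this thread


def estimate_size(payload: str) -> dict:
    """Return the char count, an approximate token count, and a limit check."""
    n_chars = len(payload)
    return {
        "chars": n_chars,
        "approx_tokens": n_chars // CHARS_PER_TOKEN,
        "over_limit": n_chars > CHAR_LIMIT,
    }
```

Calling `estimate_size(page_text)` on the serialized page gives a ballpark figure to compare against the limit.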
It's base64 chars, but that's a limitation of the underlying gRPC layer, or at least the 4M matches what I think is the default limit. For our internal networking we ended up disabling the gRPC message size limits (I think that brings it up to 100MB for a single message) by setting grpc.max_send_message_length and grpc.max_receive_message_length to -1. For Arize we just chunk the page for now. It's not a definitive solution, as I'm sure you might not want to host arbitrary binary blobs for us, but we have a short-term milestone coming up.
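To make the two workarounds concrete, here is a rough sketch, not our actual code. The helper names `chunk_base64` and `reassemble` and the constant `MAX_CHUNK_CHARS` are hypothetical. The 4M figure is the limit from this thread; a real message also carries framing and other fields, so some headroom below the limit is probably needed.

```python
import base64
from typing import List

# Workaround 1: lift the gRPC size limits. These are standard gRPC channel
# arguments; the list would be passed as `options=` when creating a channel
# or server (e.g. grpc.insecure_channel(target, options=GRPC_OPTIONS)).
GRPC_OPTIONS = [
    ("grpc.max_send_message_length", -1),
    ("grpc.max_receive_message_length", -1),
]

# Workaround 2: chunk the base64 payload so each piece stays under the limit.
MAX_CHUNK_CHARS = 4_000_000  # assumed limit from the thread, no headroom


def chunk_base64(raw: bytes, max_chars: int = MAX_CHUNK_CHARS) -> List[str]:
    """Base64-encode raw bytes and split the text into pieces of at most max_chars."""
    encoded = base64.b64encode(raw).decode("ascii")
    return [encoded[i:i + max_chars] for i in range(0, len(encoded), max_chars)]


def reassemble(chunks: List[str]) -> bytes:
    """Join the chunks in order and decode back to the original bytes."""
    return base64.b64decode("".join(chunks))
```

The chunks have to be joined in their original order before decoding, since an individual chunk is not necessarily valid base64 on its own.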
I think your limit is valid; other systems have similar ones: https://docs.wandb.ai/guides/track/limits#value-width
