Three ways to send large amounts of data over HTTP

In the early days of the web, people sent files that were just a few KB in size. Fast forward to 2023, and we enjoy high-resolution MB-sized images and watch 4K (soon to be 8K) videos that are several GB in size.

Even with a good internet connection, downloading a 5GB file can still take some time. If you own an Xbox or PlayStation, you know the feeling.

There are three ways we can reduce the time it takes to send large amounts of data over HTTP:

  • Compressing Data
  • Sending chunked data
  • Requesting data in a selected range

They are not mutually exclusive. You can use all methods together depending on your use case.

Compressing Data

To compress data, we need a compression algorithm.

When sending a request, the browser includes a header called Accept-Encoding, which contains a list of supported compression algorithms, including gzip (GZIP), compress, deflate, and br (Brotli).

Next, the server selects an algorithm it supports from the list and sets the algorithm name in the Content-Encoding header.

When the browser receives the response, the Content-Encoding header tells it how to decompress the data in the body.
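
The negotiation above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular server does it: the names ENCODERS, choose_encoding, and encode_body are made up for this example, only gzip and deflate are supported, and q-value weighting is ignored apart from skipping encodings marked q=0.

```python
import gzip
import zlib

# Encodings this sketch can produce, in order of (assumed) server preference.
ENCODERS = {
    "gzip": gzip.compress,
    "deflate": zlib.compress,  # HTTP "deflate" is the zlib data format
}

def choose_encoding(accept_encoding):
    """Pick the first server-supported encoding that the client accepts."""
    offered = set()
    for item in accept_encoding.split(","):
        name, _, params = item.strip().partition(";")
        # Skip encodings the client explicitly refuses with q=0.
        if params.replace(" ", "") in ("q=0", "q=0.0"):
            continue
        offered.add(name.strip().lower())
    for name in ENCODERS:
        if name in offered:
            return name
    return None

def encode_body(body, accept_encoding):
    """Return (extra response headers, possibly compressed body)."""
    name = choose_encoding(accept_encoding)
    if name is None:
        return {}, body  # no common algorithm: send the body as-is
    return {"Content-Encoding": name}, ENCODERS[name](body)

headers, compressed = encode_body(b"<p>hello</p>" * 200, "gzip, deflate, br")
print(headers, len(compressed))
```

The client reverses the step by reading Content-Encoding and applying the matching decompressor, for example gzip.decompress for gzip.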

Among these algorithms, the most popular is GZIP. It is an excellent choice for compressing text data such as HTML, CSS, and JavaScript.

Brotli is another algorithm worth mentioning. It performs even better than GZIP in compressing HTML.

These efficient algorithms have some limitations.

They compress text well, but they do little for images or videos. After all, media formats are already compressed.

Try compressing a video file on your computer. You will see hardly any difference in size before and after compression.
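
You can get a feel for this without a real video file. The sketch below compares gzip on repetitive HTML-like text with gzip on random bytes. The random bytes are only a stand-in for media, on the assumption that already-compressed data looks close to random to a general-purpose compressor; the helper name ratio is made up for this example.

```python
import gzip
import os

# Repetitive markup, similar to what HTML pages contain.
text = b"<div class='item'>Hello, world!</div>\n" * 1000

# Random bytes stand in for already-compressed media (an assumption).
media_like = os.urandom(len(text))

def ratio(data):
    """Compressed size divided by original size (smaller is better)."""
    return len(gzip.compress(data)) / len(data)

print(f"text ratio:       {ratio(text):.3f}")
print(f"media-like ratio: {ratio(media_like):.3f}")
```

The text shrinks to a small fraction of its size, while the media-like data stays at roughly its original size.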

Furthermore, it is almost impossible to compress a 5GB video to a few KB without losing quality.

Compression is good, but we need a better solution - send the file in chunks and assemble the partial data on the client side.

Sending chunked data

HTTP/1.1 introduced chunked transfer encoding to handle large amounts of data.

When sending the response, the server adds a header Transfer-Encoding: chunked to let the browser know that the data is transferred in chunks.

Each chunk has the following components:

  • A length marker: the size of the current chunk's data in hexadecimal, followed by CRLF
  • The chunk data itself
  • A CRLF delimiter at the end of each chunk

Want to know what CRLF is?

CR followed by LF (CRLF, \r\n, or 0x0D0A) moves the cursor to the beginning of the line and then down to the next line. You can find more details in the Further Reading section at the end of this article. Here, you can simply think of it as a delimiter.

The server continues to stream chunked data to the browser. When it reaches the end of the data stream, it appends a terminating chunk containing the following:

  • A length block with the number 0, followed by CRLF
  • An extra CRLF
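
The format described above is simple enough to produce by hand. Here is a minimal sketch of a chunked encoder; the function name encode_chunked and the tiny default chunk size are choices made for this example, and optional features such as chunk extensions and trailers are left out.

```python
def encode_chunked(data, chunk_size=8):
    """Encode bytes in the HTTP/1.1 chunked transfer format."""
    out = bytearray()
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        # Length marker: chunk size in hexadecimal, then CRLF.
        out += f"{len(chunk):X}".encode("ascii") + b"\r\n"
        # Chunk data, then the CRLF delimiter.
        out += chunk + b"\r\n"
    # Terminating chunk: length 0 plus CRLF, then an extra CRLF.
    out += b"0\r\n\r\n"
    return bytes(out)

print(encode_chunked(b"Hello, chunked world!"))
# prints b'8\r\nHello, c\r\n8\r\nhunked w\r\n5\r\norld!\r\n0\r\n\r\n'
```

Real servers use much larger chunks; 8 bytes is used here only so the structure is easy to see.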

The browser waits for all the chunks until it reaches the end marker. Then, it removes the chunk encoding, including the CRLF and length information.

Next, it combines the chunked data into a whole. Therefore, in Chrome DevTools, you can only see the assembled data, not the individual chunks.

Eventually, you receive the entire data as a single piece.
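
The reassembly step can be sketched as follows. This is a simplified decoder written for illustration, not what a browser actually runs: the name decode_chunked is made up, the whole body is assumed to be in memory already, chunk extensions are skipped, and trailers after the terminating chunk are ignored.

```python
def decode_chunked(body):
    """Strip chunk framing and return the reassembled payload."""
    out = bytearray()
    pos = 0
    while True:
        # The length marker runs up to the next CRLF.
        line_end = body.index(b"\r\n", pos)
        size_text = body[pos:line_end].split(b";")[0].decode("ascii")
        size = int(size_text, 16)  # sizes are hexadecimal
        pos = line_end + 2
        if size == 0:
            break  # terminating chunk reached
        out += body[pos:pos + size]
        if body[pos + size:pos + size + 2] != b"\r\n":
            raise ValueError("chunk data is not followed by CRLF")
        pos += size + 2
    return bytes(out)

wire = b"8\r\nHello, c\r\n8\r\nhunked w\r\n5\r\norld!\r\n0\r\n\r\n"
print(decode_chunked(wire))
# prints b'Hello, chunked world!'
```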

Chunking the data is useful. However, for a 5GB video, it still takes some time for the complete data to arrive.

Can we fetch selected chunks of data and request other chunks when needed?

HTTP says yes.

Requesting data in a selected range

Open a video on YouTube and you'll see a gray progress bar moving forward.

What you just saw is YouTube requesting data for the selected range.

This feature allows you to jump anywhere in the timeline. When you click somewhere on the progress bar, the browser requests a specific range of video data.

Implementing range requests on the server is optional. If implemented, you can see Accept-Ranges: bytes in the response header.

In YouTube's case, you can find this header in any "playback" request.

A range request header looks like `Range: bytes=0-80`. Byte positions are indexed starting from 0, and both ends of the range are inclusive.

The header is designed to be very flexible.

Assume that a resource has a total of 100 bytes.

  • Range: bytes=20- requests a range starting from byte 20 to the end, which is equal to Range: bytes=20-99.
  • Range: bytes=-20 requests the last 20 bytes of data, which is equal to Range: bytes=80-99.
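
To make these rules concrete, here is a minimal sketch that resolves a single-range header value against a resource of known size. The function name resolve_range is made up for this example, and multiple ranges are deliberately rejected to keep it short. Note that the open-ended form is written with a trailing hyphen (bytes=20-), as the HTTP specification requires.

```python
def resolve_range(value, total):
    """Turn a Range header value into an inclusive (start, end) pair.

    Returns None when the range cannot be satisfied.
    """
    unit, _, spec = value.partition("=")
    if unit.strip() != "bytes" or "," in spec:
        raise ValueError("this sketch handles one bytes range only")
    first, sep, last = spec.strip().partition("-")
    if not sep:
        raise ValueError("a range needs a hyphen, e.g. bytes=20-")
    if first == "":
        # Suffix form, bytes=-N: the last N bytes.
        n = int(last)
        if n == 0:
            return None
        start, end = max(total - n, 0), total - 1
    else:
        start = int(first)
        # Open-ended form, bytes=N-: from N to the last byte.
        end = int(last) if last else total - 1
        end = min(end, total - 1)
    if start > end or start >= total:
        return None
    return start, end

print(resolve_range("bytes=20-", 100))   # prints (20, 99)
print(resolve_range("bytes=-20", 100))   # prints (80, 99)
print(resolve_range("bytes=0-80", 100))  # prints (0, 80)
```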

If the requested range is valid, the server responds with status 206 Partial Content and a Content-Range header confirming the returned range and the total length, for example Content-Range: bytes 70-80/100.
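
On the server side, building such a response might look roughly like this. The function name partial_response and the (status, headers, body) return shape are assumptions made for this sketch; it takes an already parsed, inclusive byte range and answers 416 Range Not Satisfiable when the range falls outside the data.

```python
def partial_response(data, start, end):
    """Build (status, headers, body) for one inclusive byte range."""
    total = len(data)
    if start < 0 or start > end or start >= total:
        # Unsatisfiable: report the total size so the client can retry.
        return 416, {"Content-Range": f"bytes */{total}"}, b""
    end = min(end, total - 1)
    body = data[start:end + 1]  # end is inclusive, so slice to end + 1
    headers = {
        "Content-Range": f"bytes {start}-{end}/{total}",
        "Content-Length": str(len(body)),
    }
    return 206, headers, body

data = bytes(range(100))  # a 100-byte resource
status, headers, body = partial_response(data, 70, 80)
print(status, headers["Content-Range"], len(body))
# prints 206 bytes 70-80/100 11
```

Because both ends are inclusive, the range 70-80 carries 11 bytes, not 10.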

Range requests are widely used in video streaming and file download services.

Have you ever continued a file download after an internet outage? That's a range request.

Additionally, range requests support multiple ranges.

For example, you can request two ranges from a file, like Range: bytes=20-45, 70-80.

A multi-range response body is split into parts, somewhat like chunked data. Each part has the following components:

  • A boundary block, marking the boundary of different data blocks, starts with -- and ends with CRLF
  • Two headers, Content-Type and Content-Range, show the properties of the corresponding data block and end with CRLF
  • An extra CRLF to tell the client that real data is coming
  • Finally, a data block terminated by CRLF

The boundary is just a random string that looks like 3d6b6a416f9b5, marking the boundaries between different chunks of data.

Finally, the body ends with a closing boundary block, which starts with --, followed by the boundary string, and ends with -- and CRLF. This tells the browser that the multipart body has ended.

Let's put it all together. The response body is structured as follows.
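
As a rough illustration, the sketch below assembles such a body in Python and prints it. The function name multipart_byteranges and the text/plain content type are assumptions made for this example, and the boundary reuses the sample string from above. In a real response, the boundary is also announced in the response header, as in Content-Type: multipart/byteranges; boundary=3d6b6a416f9b5.

```python
def multipart_byteranges(data, ranges, boundary, content_type="text/plain"):
    """Build a multipart/byteranges body for inclusive (start, end) ranges."""
    total = len(data)
    sep = b"--" + boundary.encode("ascii")
    out = bytearray()
    for start, end in ranges:
        out += sep + b"\r\n"                                  # boundary block
        out += f"Content-Type: {content_type}\r\n".encode("ascii")
        out += f"Content-Range: bytes {start}-{end}/{total}\r\n".encode("ascii")
        out += b"\r\n"                                        # data follows
        out += data[start:end + 1] + b"\r\n"                  # data block
    out += sep + b"--\r\n"                                    # closing boundary
    return bytes(out)

# A 100-byte resource made of digits, requested with bytes=20-45, 70-80.
resource = b"0123456789" * 10
body = multipart_byteranges(resource, [(20, 45), (70, 80)], "3d6b6a416f9b5")
print(body.decode("ascii"))
```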

Summary

HTTP helps us transfer large amounts of data through compression, chunked transfer, and range requests.

The idea here is to send the data we need when we need it, and then send other data when needed. You can try the same idea when you encounter problems in designing similar systems.

By combining these three methods, we can request just a range of the data and receive it compressed and in chunks.
