Summary

HF Radios are important for military communications. IP is widely used and is the basis for most network communication. This paper looks at use of IP over HF Radio and the efficiency of different types of application over IP. The key findings are:

  • That IP can be operated over HF Radio, and that doing so may be useful, particularly to enable support of applications that are only available to run over IP.
  • That most IP based applications running over an HF link will make very inefficient use of the HF link, and that direct application use of the HF link using STANAG 5066 will give much better performance.

This paper concludes that applications intended for regular use over HF Radio should not use IP and should instead be directly integrated with STANAG 5066.

IP

IP (Internet Protocol) is the basis of the Internet, and the most widely used network protocol. It supports a number of transports, including TCP, UDP and RTP, and a myriad of end applications. IP is the universal interface between physical networks and applications. Application writers develop to IP, or to one of the standardized protocols or middleware systems that run over IP. Networking technology providers seek to provide IP operation as a primary goal.

"IP Everywhere" is such a strong culture, that it is difficult to appreciate the few places where IP is not the right answer. IP is universal and is the right answer almost all of the time. This paper looks at why IP is not the right solution for HF Radio. Another example of where IP is not right is deep space communications, where the long

Key Factors for Good Performance over HF

HF Radio is very slow, with typical rates around 1200 bits per second. This slow speed is usually perceived as the primary difficulty with using HF. For components above the modem level, the speed is fixed by the modem. Two general issues relating to performance are described below.

Turnaround time

For data applications, the most significant problem with HF is turnaround time. Turnaround time is the time taken to change direction of data flow, and for HF Radio this is measured in seconds or tens of seconds. Turnaround time is caused by factors at multiple levels:

  1. Modem turnaround time
  2. Interleaver completion time. This can be a significant time (e.g., 2.16 seconds for STANAG 4539 short interleaver or 8.64 seconds for the long interleaver).
  3. Delays due to Comsec layer.
  4. The HF Radio must switch from send to receive. (An HF Radio is a simplex device, and cannot send and receive at the same time).
  5. HF Skywave latency times.

All of this leads to a turnaround time of at least a few seconds and sometimes as much as 20-30 seconds. In order to make (reasonably) efficient use of an HF link, it is critical to have transmit time longer than turnaround time. This has a significant impact on applications.
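
The relationship between transmit time and turnaround time can be illustrated with a simple model. This is a minimal sketch rather than anything defined by a standard: it assumes that every transmission is followed by one turnaround during which no data flows, and the function name and the 10 second turnaround figure are illustrative assumptions.

```python
def link_efficiency(transmit_seconds: float, turnaround_seconds: float) -> float:
    """Fraction of elapsed time spent sending data.

    Assumes (as a simplification) that each transmission is followed
    by exactly one turnaround in which the link carries no data.
    """
    if transmit_seconds <= 0 or turnaround_seconds < 0:
        raise ValueError("transmit time must be positive, turnaround non-negative")
    return transmit_seconds / (transmit_seconds + turnaround_seconds)


# Assumed 10 second turnaround, with short, equal and long transmit times.
for transmit in (2.0, 10.0, 120.0):
    efficiency = link_efficiency(transmit, 10.0)
    print(f"{transmit:6.1f} s transmit: {efficiency:.0%} of link time carries data")
```

Under this model, a transmit time equal to the turnaround time wastes half of the link, which is why long transmissions matter so much.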

For VHF and higher frequencies, turnaround time is less, so the impact is less significant. For VHF, full duplex transmission can also be used, which removes the impact completely.

Efficient Use of the Pipe

It is important that applications use the HF link as efficiently as possible, as bandwidth is limited. There are a number of things that need to be considered:

  • Data Compression. This will be important for most applications. Compression is not discussed in this paper, but should be addressed by applications operating over HF.
  • Protocol and Header Overhead. It is important that protocols do not consume undue bandwidth with headers and other information. This should be optimized. The protocols discussed in this paper, and in particular STANAG 5066, are well optimized.
  • Grouping data. Where possible data should be sent together in long transmissions, to minimize the effect of turnaround time.
  • Avoiding repeat transmissions, except when data is lost. It is clearly important that data is not sent twice when not needed, and applications should be designed to avoid this.

These last two points have potential for significant inefficiency, and are discussed in more detail in the context of relevant protocols.

STANAG 5066

STANAG 5066 is a NATO standard for running applications over HF Radio. This is described in the Isode whitepaper STANAG 5066: The Standard for Data Applications over HF Radio.

An HF Modem provides a quite basic send/receive capability. STANAG 5066 provides application oriented capabilities over this. Capabilities of relevance to this paper:

  1. Decoupling. Data can be accepted from one or more applications, and queued for sending. A STANAG 5066 server can then “fill the pipe” to avoid wasting capacity.
  2. Fragmentation. STANAG 5066 will break application data into blocks appropriate for the modem speed (DPDUs). This is important both for precedence handling and acknowledgement.
  3. Precedence handling. STANAG 5066 will send higher priority data (DPDUs) first.
  4. Support for multicast, which can also be used to support nodes in Radio Silence (also known as EMCON (Emission Control)).
  5. Reliable transmission is supported for non-multicast traffic, by acknowledgement at the DPDU level. This means that when data is lost (e.g., due to radio noise), only the lost DPDUs are retransmitted. This is more efficient than application level retransmission.
  6. Minimizing turnarounds. STANAG 5066 has long transmit times (up to 127.5 seconds) and works to maximize transmit time. Acknowledgements are delayed wherever possible, so reliable transmission does not increase the number of turnarounds.
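
The benefit of fragmentation combined with DPDU level acknowledgement can be sketched as follows. This is an illustration only: the 200 byte block size is a hypothetical value (the real DPDU size depends on modem speed), protocol headers are ignored, and the function names are not part of STANAG 5066.

```python
def fragment(data: bytes, block_size: int) -> list[bytes]:
    """Split application data into fixed-size blocks (the last may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def bytes_resent(data: bytes, block_size: int, lost_blocks: set[int]) -> tuple[int, int]:
    """Return (bytes resent with block level retransmission,
    bytes resent if the whole unit of data must be sent again)."""
    blocks = fragment(data, block_size)
    block_level = sum(len(blocks[i]) for i in lost_blocks)
    application_level = len(data) if lost_blocks else 0
    return block_level, application_level


# A 2 kByte unit of data, hypothetical 200 byte blocks, one block lost.
unit_data = bytes(2048)
print(bytes_resent(unit_data, 200, {3}))
```

With these assumed figures, losing one block costs 200 bytes of retransmission at the block level, against 2048 bytes if the application has to resend everything.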

The central service in STANAG 5066 is "unit data". This service allows transfer of a block of data, with a maximum size set at the STANAG 5066 level (typically 2 kBytes). Applications using

Unit data may be unacknowledged (best effort) or acknowledged (reliable). The choice is made by the application using STANAG 5066. Unacknowledged must be used where data is being sent to more than one recipient (broadcast or multicast) or where the recipient is in EMCON (Emission Control) and cannot transmit an acknowledgement.

IP over HF Radio

IP could be mapped onto HF Radio in a number of ways. There are two mechanisms standardized (STANAG 5066 and STANAG 4538) which are discussed here. These provide similar characteristics to the IP user, and so the choice of mapping does not significantly affect the analysis in this paper. It is hard to conceive of alternate mappings that would lead to improvements.

IP over HF Radio using STANAG 5066

STANAG 5066 defines use of IP over STANAG 5066. This support is mandatory in the current standard, although not all STANAG 5066 products support it. The mapping is very simple: essentially an IP packet is mapped directly onto STANAG 5066 unit data. There are two options for unit data, both of which are valid for IP:

  1. Unreliable. This is a natural mapping for IP, as IP is defined as an unreliable protocol.
  2. Reliable. This may be used for unicast and non-EMCON transmission.

There are a number of factors in the choice:

  • If data needs to be retransmitted, reliable will be more efficient, as this can be done at the DPDU level, rather than forcing the application to retransmit the complete unit data.
  • Some applications respond to lost data by either reducing transmission rate, or by packet exchange which leads to more turnarounds. These are both undesirable, and can be addressed by the reliable option.
  • A consequence of minimizing turnarounds is that in the event of data loss, IP packets using the reliable option may be considerably delayed. Some applications and operating systems respond badly to this delay, and data may be retransmitted unnecessarily.

This choice gives a flavor of the issues that will be examined in the rest of this paper. The support of IP over HF radio is quite straightforward. The issues arise from the interaction of applications using IP and consequences of the underlying characteristics of HF Radio.

Where priority data is carried in IP, this can be mapped onto STANAG 5066 priority. Handling of priority at the IP level is discussed in the Isode whitepaper Sending FLASH Messages Quickly.

IP over HF Radio using STANAG 4538

STANAG 4538 provides an alternate set of data link services and ARQ (acknowledgement mechanisms to support reliable data) to those of STANAG 5066. STANAG 4538, with its associated specifications, is an example of 3G HF Radio. STANAG 4538 provides a reliable point to point data service that will work with very poor radio conditions. This makes it ideal for use with small “man pack” radios with whip aerials that will often need to deal with poor signals. STANAG 4538 defines a mapping of IP over its data services, which is a straightforward use of the underlying reliable data transfer. It has characteristics very similar to the reliable mapping of STANAG 5066.

STANAG 5066, with its associated standards, is an example of 2G HF Radio. It might be assumed that 3G is always preferable to 2G, but this is not the case. The STANAG 5066 data link and ARQ will perform better than STANAG 4538 in fair and good radio conditions, so it is the best choice in situations such as naval communications or strategic links where more powerful radios and large aerials can be used. STANAG 5066 also provides multiplexing, EMCON support and multicast, which are not available in STANAG 4538 data services.

STANAG 4538 data link may also be used in conjunction with STANAG 5066 application integration. This will enable an application to use STANAG 5066 as the mechanism for application separation and API, and then use the STANAG 4538 data link services. This is the approach recommended by this paper to support applications over STANAG 4538.

The Example Applications

This paper looks at two example applications to analyze the performance of working with and without IP over STANAG 5066. These are:

  1. Internet messaging (message submission and transfer). IP and STANAG 5066 mappings are defined:
    • SMTP (Simple Mail Transfer Protocol) is defined to work over IP.
    • STANAG 5066 Annex F defines HMTP (HF Message Transfer Protocol). This is a variant of SMTP, that provides some simple SMTP level optimizations and defines a direct mapping onto STANAG 5066.
  2. STANAG 4406 Military Messaging. STANAG 4406 defines operation over low bandwidth in Annex E, and this operates over the ACP 142 protocol. IP and STANAG 5066 mappings are defined:
    • ACP 142 defines operation over the Internet Standard UDP (User Datagram Protocol) which operates over IP.
    • ACP 142 is defined so that it can be operated over different underlying protocols. STANAG 4406 Annex E defines operation directly over STANAG 5066.

SMTP gives a good insight into a TCP based application, which is a common choice for Internet applications (it is not used for real time applications such as voice and video, but is used for most other applications). ACP 142 was designed for use over HF Radio, and is a good example of a rate based approach.

As well as covering applications that are important in their own right, the analysis shows a number of key issues in the way the various protocol combinations work. This paper looks to highlight key protocol choices, rather than give detailed descriptions. Many of the protocols examined in this paper are presented in a highly simplified manner, in order to make clear the major features and in particular the characteristics that will impact performance.

Data Streams & Internet Messaging

This section looks at the underlying mapping of Internet Messaging (SMTP over TCP and HMTP direct over STANAG 5066).

TCP

In order to analyze performance, it is important to understand (in a very simplistic way) how TCP works in conjunction with IP. In an IP network there are no end to end circuits or virtual circuits. IP routers simply switch packets. When congestion occurs, the routers drop packets.

TCP sends data packets in order, and gets back an acknowledgement (ACK) for each data packet. If it does not get an ACK back within a reasonable time, it resends the data packet. Both data packets and ACKs are carried by IP, and data flows in both directions to support TCP. TCP is a windowing protocol: the receiver specifies in ACK packets how many bytes beyond the data referenced by the ACK the sender may send. The window is the mechanism used by TCP to control how fast it goes. When packets are lost, TCP will reduce (close down) the window, which will in turn reduce the data rate. This “fair” behavior of TCP is a key element of controlling congestion in the Internet.

The sequence below shows how TCP with data flowing from client to server would map onto IP packets:

Client: TCP SYN ->
Server: <- TCP SYN,ACK
Client: TCP ACK & DATA ->
Server: <- TCP ACK
Client: TCP DATA ->
Server: <- TCP ACK
Client: TCP DATA ->
Server: <- TCP ACK
Client: TCP DATA ->
Server: <- TCP ACK
Client: TCP DATA ->
Server: <- TCP ACK
Client: FIN,ACK ->
Server: <- TCP RST,ACK

If mapped onto a high latency link, the order of packets may change as follows:

Client: TCP SYN ->
Server: <- TCP SYN,ACK ** Turnaround 1
Client: TCP ACK & DATA -> ** Turnaround 2
Client: TCP DATA ->
Client: TCP DATA ->
Client: TCP DATA ->
Server: <- TCP ACK ** Turnaround 3
Server: <- TCP ACK
Server: <- TCP ACK
Server: <- TCP ACK
Client: TCP DATA -> ** Turnaround 4
Server: <- TCP ACK ** Turnaround 5
Client: FIN,ACK -> ** Turnaround 6
Server: <- TCP RST,ACK ** Turnaround 7

The key change to note is that multiple DATA packets are sent before the associated ACKs come back. The sequence above notes how the TCP data flow causes turnarounds:

  • Turnarounds 1 & 2 are associated with opening the TCP connection.
  • Turnarounds 3 & 4 are associated with the TCP window, requiring an ACK before more data can be sent.
  • Turnaround 5 is used to ensure that the last data is received by the application before closing the connection.
  • Turnarounds 6 & 7 are associated with closing the connection.

It can be seen that if a suitable size of window is chosen, ongoing data transfer can be mapped efficiently onto the underlying HF data exchange. There are quite a few turnarounds associated with open and close, so TCP would not be efficient for short interactions.
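
The turnaround counts for the two TCP sequences shown above can be checked mechanically by counting changes of sending direction. The sketch below is illustrative: the function name and the single-letter encoding (C for a packet sent by the client, S for a packet sent by the server) are assumptions made for this example.

```python
def count_turnarounds(directions: str) -> int:
    """Count changes of sending direction in a packet sequence.

    Each character identifies the sender of one packet, in order.
    """
    return sum(1 for prev, cur in zip(directions, directions[1:]) if prev != cur)


# Senders of the fourteen packets in each sequence above.
strict_alternation = "CS" * 7          # every packet answered before the next is sent
high_latency = "CSCCCCSSSSCSCS"        # data and ACKs grouped into bursts

print(count_turnarounds(strict_alternation))
print(count_turnarounds(high_latency))
```

The same fourteen packets need 13 changes of direction when sent in strict alternation, but only the 7 marked turnarounds when grouped as in the high latency sequence.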

A key benefit of using TCP (over IP) is that a wide range of applications work "out of the box".

A basic problem with TCP and HF is that most "out of the box" TCP implementations are tuned for much faster networks than HF. A good TCP implementation will adapt its settings to network conditions, but this is likely to have a performance cost in extra turnarounds. Problems that can arise:

  1. Window is too small. This will cause the sender to wait for ACKs too soon, and have too short a transmit time before the turnaround.
  2. Window is too large. This will cause the sender to transmit more packets than can be queued (in the various queues between the sender and the STANAG 5066 system) and IP packets will get dropped.
  3. Packets retransmitted without need. This is most likely to be a problem early on, before the sender has an accurate measure of round trip time.

In summary, efficiency will depend significantly on the TCP implementations, and how they react to the network characteristics. TCP will be a poor choice for short connections and for "chatty" applications. It should give reasonable performance for long lived TCP connections with steady data flows that can map onto the optimal "two minute" HF model.
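
The link between window size and transmit time can be made concrete with some simple arithmetic. This is a rough sketch under stated assumptions: it ignores header overhead, compression and retransmission, and the function names and example figures are illustrative.

```python
def window_bytes_for(transmit_seconds: float, bits_per_second: float) -> int:
    """Bytes that must be in flight to keep the link busy for the given time."""
    return int(transmit_seconds * bits_per_second / 8)


def transmit_seconds_for(window_bytes: int, bits_per_second: float) -> float:
    """Time taken to send one full window at the given link speed."""
    return window_bytes * 8 / bits_per_second


# At 1200 bits per second, filling a two minute transmission:
print(window_bytes_for(120, 1200))

# Time to send an assumed 4096 byte window at the same speed:
print(round(transmit_seconds_for(4096, 1200), 1))
```

At 1200 bits per second, a two minute transmission corresponds to 18000 bytes in flight, while a 4096 byte window is exhausted after about 27 seconds, forcing an early turnaround.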

If a reliable mapping of IP onto STANAG 5066 or IP over STANAG 4538 is chosen, it is important that TCP is tuned so that it does not cause turnarounds or retransmissions as a consequence of the long delays. If data is lost with an unreliable mapping, additional turnarounds are a likely consequence.

SMTP & Turnarounds

The core SMTP protocol is "chatty". The sending implementation provides data (e.g., a recipient email address) and then waits for the receiver to accept or reject the address. This leads to many application

In an HF environment, this is highly undesirable application behavior, as it requires lots of turnarounds in order to support it.

The turnarounds of SMTP and TCP are illustrated below, with each line representing an IP packet. The protocol exchange was taken from a real interaction from Microsoft Outlook to Isode’s M-Switch server, using Microsoft’s secure authentication.

Client: TCP SYN ->
Server: <- TCP SYN,ACK
Client: TCP ACK ->
Server: <- SMTP 220 (Welcome)
Client: SMTP EHLO ->
Server: <- TCP ACK
Server: <- SMTP 250 (OK)
Client: SMTP AUTH NTLM (Authenticate Session) ->
Server: <- SMTP 334 (Intermediate Authentication Response)
Client: SMTP Authentication Data ->
Server: <- SMTP 334
Client: SMTP Authentication Data ->
Server: <- SMTP 235 (Authentication successful)
Client: SMTP Mail From (Message Sender) ->
Server: <- TCP ACK
Server: <- SMTP 250 (OK)
Client: SMTP RCPT TO (Recipient) ->
Server: <- TCP ACK
Server: <- SMTP 250 (OK)
Client: SMTP DATA (Ask if ready for the message) ->
Server: <- TCP ACK
Server: <- SMTP 354 (Go ahead)
Client: SMTP Message Body ->
Server: <- TCP ACK
Client: SMTP EOM ->
Server: <- TCP ACK
Server: <- SMTP 250
Client: TCP ACK ->
Client: SMTP QUIT ->
Client: TCP FIN,ACK ->
Server: <- SMTP 221
Server: <- TCP RST,ACK

This exchange is shown at the IP level. Note that this exchange was to transfer a very short message, which fitted into a single IP packet. There are 23 changes of direction, which would require 23 turnarounds. This is very inefficient.

The way TCP maps data onto IP packets has an effect on performance. The above scenario could be optimized by sharing IP packets. The exchange above is optimized for a fast low latency network.

Server to Server communication with a modern SMTP implementation making full use of pipelining would need fewer packets, but the number of turnarounds remains significant. A good SMTP implementation such as M-Switch might give the following, assuming no authentication and an efficient mapping to IP:

Client: TCP SYN ->
Server: <- TCP SYN,ACK
Client: TCP ACK ->
Server: <- SMTP 220 (Welcome)
Client: SMTP EHLO ->
Server: <- TCP ACK + SMTP 250 (OK)
Client: SMTP Addressing Information ->
Server: <- SMTP Address Response
Client: SMTP Message ->
Server: <- SMTP Message Response
Client: TCP FIN,ACK + SMTP QUIT ->
Server: <- TCP RST,ACK SMTP 221

It can be seen that this is much better than the previous example, with just eleven turnarounds, but still has a significant turnaround overhead.

HMTP Direct Mapping to STANAG 5066

HMTP has two key differences from SMTP. The first is that it defines a mode of operation very similar to standard SMTP pipelining that minimizes the number of turnarounds (to two for a small message). The second is that it fixes options to maximize interoperability without the need for service negotiation.

The key feature of HMTP is that it defines a mapping onto unit data with ARQ. The ARQ is essential, as HMTP cannot deal with data loss. For a message that fits into three MTUs, this would lead to the following sender/receiver interaction:

Sender: Data (HMTP Commands & Message) ->
Sender: Data (HMTP Commands & Message) ->
Sender: Data (HMTP Commands & Message) ->
Receiver: <- ARQ (for 3 Data)
Receiver: <- Data (HMTP Response)
Sender: ARQ ->

It can be seen that this is very efficient, with just two turnarounds needed (where the message can be transmitted within 127.5 seconds). The mapping of a TCP based application directly onto STANAG 5066 offers significantly better performance.
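
The turnaround counts quoted in this section (23, 11 and 2) can be turned into a rough estimate of time lost to turnarounds. This is a back-of-envelope sketch only: the 5 and 20 second turnaround times are assumed values, chosen from the range given earlier in this paper, and the time taken to send the data itself is ignored.

```python
# Turnaround counts taken from the three exchanges described in this section.
TURNAROUNDS = {
    "SMTP over TCP (desktop client)": 23,
    "SMTP over TCP (pipelined)": 11,
    "HMTP over STANAG 5066": 2,
}


def turnaround_overhead(turnarounds: int, seconds_per_turnaround: float) -> float:
    """Total time spent changing link direction, ignoring data transfer time."""
    return turnarounds * seconds_per_turnaround


for name, count in TURNAROUNDS.items():
    low = turnaround_overhead(count, 5.0)    # assumed fast turnaround
    high = turnaround_overhead(count, 20.0)  # assumed slow turnaround
    print(f"{name}: {low:.0f} to {high:.0f} seconds lost to turnarounds")
```

Even with the faster assumed turnaround, the desktop client exchange spends almost two minutes changing direction in order to deliver a message that fits in a single IP packet.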

ACP 142

ACP 142 is a protocol for supporting Multicast and EMCON transmission of data, over HF and other networks. STANAG 4406 Annex E uses ACP 142 by providing it with a single compressed file to transfer to one or more destinations. This is described in the Isode white paper Military Messaging over HF Radio and Satellite using STANAG 4406 Annex E.

How ACP 142 works

ACP 142 works by dividing the data to be transferred into multiple packets. It sends out each packet in turn, to unicast or broadcast addresses. Each recipient will inform the sender of any missing packets (so that they can be re-transmitted) and at the end tells the sender that it has all of the packets. Key features of ACP 142:

  • It uses an unreliable datagram service, and does not require low level acknowledgements.
  • There is no windowing mechanism to control data transfer rate.

A key issue to address with underlying mappings is how to optimize data transfer rate.

ACP 142 direct over STANAG 5066

ACP 142 has a straightforward mapping onto the unit data service of STANAG 5066, using the unreliable (non-ARQ) option.

ACP 142 provides functionality to support multicast and EMCON transmissions. It also provides an “optimal” transfer of data to a single recipient. The following sequence is directly comparable to the HMTP sequence shown earlier:

Sender: Data (ACP 142 Addressing Information) ->
Sender: Data (ACP 142 Data) ->
Sender: Data (ACP 142 Data) ->
Sender: Data (ACP 142 Data) ->
Receiver: <- Data (ACP 142 Ack)

It can be seen that there is only a single turnaround, after the data has been sent (as ARQ is not used). If data is lost, this will be handled by the ACP 142 protocol. This uses the absolute minimum of turnarounds for reliable data transfer.

ACP 142 handles loss of intermediate packets, so if a packet is lost at the modem level, only the lost packet needs to be retransmitted.
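
The selective retransmission idea can be sketched as follows. This illustrates the principle only: the packet numbering and the function name are hypothetical and do not reflect the actual ACP 142 PDU formats.

```python
def missing_packets(total_packets: int, received: list[int]) -> list[int]:
    """Sequence numbers (1-based) the receiver still needs from the sender."""
    return sorted(set(range(1, total_packets + 1)) - set(received))


# Four packets sent, packet 3 lost on the HF link.
report = missing_packets(4, [1, 2, 4])
print(report)  # the receiver asks only for the lost packet

# After packet 3 is resent, nothing is missing and the transfer is complete.
print(missing_packets(4, [1, 2, 4] + report))
```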

Rate control with STANAG 5066 is straightforward, and is handled by the SIS protocol. The application can send data as fast as it wishes using the unit data service. If the STANAG 5066 SIS server has too much data from the sending application (or from another application) it can request the application to stop sending. It will inform the application later when it can send data again. Because the SIS server interacts with the modem and handles all of the data being sent, it has all the information necessary to optimize use of HF link. The flow control enables the ACP 142 implementation to send data at the optimal rate.
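
The stop and resume behavior described above might look like this from the application side. This is a minimal sketch: the class and method names are hypothetical and are not the names of SIS primitives, and the `send` callable stands in for whatever interface hands unit data to the STANAG 5066 server.

```python
from collections import deque


class FlowControlledSender:
    """Hypothetical application-side queue that obeys stop/resume signals."""

    def __init__(self, send):
        self._send = send          # callable that passes one unit of data to the server
        self._pending = deque()    # data accepted from the application but not yet passed on
        self._may_send = True

    def submit(self, unit_data):
        """Queue data and pass on as much as is currently allowed."""
        self._pending.append(unit_data)
        self._drain()

    def flow_off(self):
        """The server has asked the application to stop sending."""
        self._may_send = False

    def flow_on(self):
        """The server has said that sending may resume."""
        self._may_send = True
        self._drain()

    def _drain(self):
        while self._may_send and self._pending:
            self._send(self._pending.popleft())


passed_on = []
sender = FlowControlledSender(passed_on.append)
sender.submit(b"first")
sender.flow_off()
sender.submit(b"second")   # held back until the server is ready
sender.flow_on()
print(passed_on)
```

Nothing is dropped while sending is stopped; the data simply waits, so the application never has to guess the link rate.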

ACP 142 over IP

ACP 142 maps cleanly onto IP using the User Datagram Protocol (UDP), which is a simple application service layered directly on IP. When IP is mapped without ARQ, the result is very similar to the direct mapping to STANAG 5066, with a small overhead of the UDP and IP headers.

A transmitting ACP 142 application needs to control the rate at which it sends out packets, and when using IP there is no protocol mechanism to achieve this. The approach adopted by implementations we are aware of is to configure a rate at which packets are sent out. This value will be set to match the underlying (HF) network. Setting the rate needs care:

  • If the rate is set too low, bandwidth is wasted.
  • If the rate is set too high, IP packets will get dropped. This will cause ACP 142 to retransmit in response to missing packet information from receivers. This will be inefficient.

The difficulty is that the rate for a real system will be variable, and the application has no mechanism to determine this rate. The HF bandwidth available may change according to conditions. The link may be shared with other STANAG 5066 applications (which may have higher precedence), and their use will not be visible.
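
The trade-off in choosing a fixed rate can be shown with a simple model. This is a sketch under simplifying assumptions: it ignores queueing, treats any excess over the available rate as dropped, and does not model the cost of the resulting retransmissions. The function name and the example rates are illustrative.

```python
def fixed_rate_outcome(configured_bps: float, available_bps: float) -> tuple[float, float]:
    """Return (link utilization, fraction of sent data dropped) for a sender
    transmitting at a fixed configured rate over a link with the given capacity."""
    if configured_bps <= 0 or available_bps <= 0:
        raise ValueError("rates must be positive")
    utilization = min(configured_bps, available_bps) / available_bps
    dropped = max(0.0, configured_bps - available_bps) / configured_bps
    return utilization, dropped


print(fixed_rate_outcome(1200, 1200))  # rate matches the link
print(fixed_rate_outcome(600, 1200))   # rate set too low: half the link is idle
print(fixed_rate_outcome(1200, 600))   # link has slowed: half the data is dropped
```

A rate that is correct at configuration time becomes wrong as soon as conditions change or another application starts sharing the link.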

Some implementations use ICMP Source Quench to provide flow control. This works to some extent, but has a number of problems:

  • Source Quench is not allowed for multicast addresses. This is a problem as ACP 142 is often used for multicast destinations.
  • The Source Quench may be generated after packet discard, and the ACP 142 implementation cannot determine whether or not to retransmit.
  • The Source Quench is a crude “slow down” signal. It does not indicate how much to slow down or when or if it is safe to speed up again.

In summary direct mapping of ACP 142 to STANAG 5066 gives better performance than use of IP, due to the ability to provide optimal control of data rate.

ACP 142 vs Data Stream

Although this paper is intended primarily to consider the use of IP with HF Radio, it also provides a useful comparison between ACP 142 and a Data Stream mechanism such as the one used by HMTP for carrying "bulk data". Where it is available as an option, ACP 142 has a number of advantages:

  1. It supports multicast transmission.
  2. It supports EMCON transmission.
  3. For transfer to a single recipient that is larger than the MTU size, it requires fewer turnarounds, and is likely to give better performance.

Analysis of IP over HF Radio

The paper has looked at two specific applications, and compared operation with and without use of IP. This section looks at how this specific comparison applies more generally. Reliable applications fall into two major classes according to how data transfer rate is controlled: rate based and window based. These are now considered.

ACP 142 is a good example of a rate based protocol. The key performance issue with a rate based protocol and HF is getting the rate correct. This is straightforward when using STANAG 5066 directly, and not possible when using IP. For rate based protocols, using IP is undesirable.

TCP is a windowing protocol, and in practice the only one that matters. The key issue for TCP is to minimize turnarounds. For a chatty application such as SMTP, the cost of turnarounds is prohibitive. A TCP based application could give reasonable performance if the following factors are all present:

  • A long lived application (so the TCP start/stop overhead can be amortized).
  • Stable HF data rate (i.e., no change in modem speed, or other applications sharing the link), to avoid overhead and turnarounds of adjusting the window.
  • Low transmission errors (as these will lead to extra turnarounds).

These conditions are quite restrictive, so in most situations, a direct mapping onto STANAG 5066 is going to give much better performance than TCP over HF Radio.

Another class of application is based on unreliable communication, typically using the User Datagram Protocol (UDP). In situations of low packet loss, a UDP based approach is in practice reasonably reliable. It would be sensible to use a reliable mapping of IP onto STANAG 5066 or STANAG 4538 in support of such protocols, to minimize packet loss over the HF link. This sort of application is only going to be useful for low volumes of data. A key problem is that there is no feedback to the application if more data is sent than can be handled by the link. Unreliable communication is only sensible for a specialized class of applications, as reliability is generally desirable.

HF Links are also used for voice communication. This is generally given priority over data, and data transfer cannot generally co-exist with voice. This means that voice usage can interrupt data, and needs to be taken into account by data applications.

Further Reading

Several papers relating to this topic appear in the IEE 9th International Conference on HF Radio Systems and Techniques:

  • IP over HF as a bearer service for NATO formal messages
  • Bowman HF IP network solution
  • IP traffic over STANAG 5066

With some of these papers, it is important to read the details and look at the numbers, which are all in line with the findings in this paper. Some papers are written to look at use of IP, and summarize how it works, and sometimes performance is reasonable. This is not inconsistent with the technical conclusion here that it works, and performance varies from atrocious to sub-optimal.

Conclusions

This paper has shown quite clearly that direct use of STANAG 5066 will be substantially more efficient than the use of IP. The consequences of this are:

  1. That IP can be operated over HF Radio, and that doing so may be useful, particularly to enable support of applications that are only available to run over IP.
  2. That most IP based applications running over an HF link will make very inefficient use of the HF link, and that direct application use of the HF link using STANAG 5066 will give much better performance.

This paper concludes that applications intended for regular use over HF Radio should not use IP and should instead be integrated with HF Radio using STANAG 5066.