Optimization of the Low‐Speed Link Throughput for Voice Services

The weak point of the field network is its E12 low-speed relay links, which makes bandwidth savings necessary. For this purpose, the bandwidth required for voice transmission has been analyzed down to the level of the physical layer. To increase the network throughput, we propose changing the Voice Payload Size. After finding the optimal Voice Payload Size, tests were carried out to verify its impact on call quality. The packet loss rate is chosen as the parameter for modelling real traffic. To obtain the expected voice quality, the E-model according to the G.107 Recommendation is applied. The theoretical results are compared with practical measurements in which the proprietary Cisco Packet Loss Concealment algorithm is used to evaluate the voice quality. The results obtained have confirmed the suitability of the solution.


Introduction
Nowadays, the Army of the Czech Republic has a modern battlefield communication network. In line with current trends in communications, the backbone battlefield network transfers all types of information using the Internet Protocol (IP). Having overcome some initial problems, the modernization brought a stable, unified voice and data network that enables the use of advanced communication services. Given the specific character of the network, the ability to securely encrypt data and voice traffic is essential.
The basic building elements of the backbone of the battlefield network are commercial Cisco devices (2800/2900 and 3800/3900 series routers) equipped with Cisco CallManager Express (CCME), which provides voice services based on Voice over Internet Protocol (VoIP). Another, equally important, element of the network is the external data encryptors. In our work, we assume the use of commercial devices using Internet Protocol Security (IPsec) in tunnel mode. The individual communication nodes are interconnected by low-speed radio-relay links providing transparent transmission of the E12 interface. A simplified diagram of the backbone battlefield network is shown in Fig. 1.

Fig. 1 Simplified battlefield network topology
A modern network allows a lot of useful real-time information to be transferred from the battlefield to the command post and vice versa. However, these new applications (e.g. video streaming) require sufficient bandwidth. From this point of view, the E12 interface with a transmission capacity of 2 Mb/s seems to be the weakest point of the network. The aim of our work is to find a more efficient way of using the transmission medium for calls by reducing the overhead information through an appropriate setting of the voice payload size.
In our previous work, we dealt with the issue only at layer 3 of the Transmission Control Protocol / Internet Protocol (TCP / IP) model [1]. There are options to streamline the transmission of voice other than changing the voice payload size. Publications such as [2,3] deal with more efficient transport of voice using data aggregation. The article [4] describes options to streamline data transfer at the tactical level by a header compression technique. The paper [5] provides a voice capacity analysis in Voice over Wireless Local Area Network (VoWLAN), while the authors of [6] have even suggested a new transport protocol for the transmission of VoIP services with a significantly lower overhead. In addition, the authors of [7] have developed a VoIP codec that can be used under very severe communication conditions.
These methods would require substantial changes in the network configuration, or changing or supplementing the hardware (HW) of the network nodes. In contrast, the voice payload size can be changed very easily by configuring the routers.
This article analyzes the behaviour of a specific network with encrypted transmission of voice data at the level of end network nodes. Unlike previously published works, it considers the effect of the specific communication interface at 2 048 kb/s (E12), the Cisco High-Level Data Link Control (cHDLC) protocol and the High-Density Bipolar Order 3 (HDB3) line code.
The organization of this paper is as follows. In Section 2, we derive the required bandwidth of one voice channel with regard to the specifics of the network (data encryption and the cHDLC protocol used). Section 3 focuses on tests carried out in order to verify the calculated maximum throughput of the E12 connection for VoIP traffic. In Section 4, the influence of the packet loss rate on voice quality is tested for the chosen Voice Payload Period (VPP) of 60 ms. Finally, Section 5 summarizes the results and concludes the paper.

Voice Payload Period and Bandwidth Requirement
Nowadays, unlike conventional digital telephone networks, VoIP technology allows the communicating parties to negotiate the parameters used for voice transmission during the call setup. This is an important issue because these parameters determine the voice quality as well as the network bandwidth requirements. The required bandwidth is affected by the type of codec used, the Voice Payload Size (VPS) and the possible use of the Voice Activity Detection (VAD) technique. On the problematic segments of the transmission path, the bandwidth can be further reduced by header compression.
A wide range of codec types can be used. Each of them defines the Codec Sample Interval (CSI, the interval at which the codec operates) and the Codec Sample Size (CSS, the number of bits at the coder output in each codec sample interval). The voice coder has a constant Codec Bit Rate (CBR) at its output, corresponding to Eq. (1).

$$\mathrm{CBR} = \frac{\mathrm{CSS}}{\mathrm{CSI}} \qquad (1)$$
When a codec with a lower CBR is used, the voice quality is reduced. Thus, each codec is evaluated using a Mean Opinion Score (MOS) scale. In the researched network, the G.729 codec is used. In terms of its CBR of 8 kb/s, its MOS value of 3.9 and its common implementation, the G.729 appears optimal for the studied network.
The VPS parameter determines how many bytes from the voice coder are encapsulated into the packet payload field. The VPS is expressed in bytes and can vary from codec to codec. Some publications state the VPS in milliseconds, which we consider confusing. We use the term Voice Payload Period (VPP) instead; it specifies the time length of the voice segment transmitted in one packet. Of course, the VPP must be a multiple of the CSI. The dependence between the VPS and the VPP is expressed by relationship (2), where the CSS is given in bits and the VPS in bytes.
$$\mathrm{VPS} = \frac{\mathrm{VPP}}{\mathrm{CSI}} \cdot \frac{\mathrm{CSS}}{8} \qquad (2)$$
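As a quick numeric check of the CBR and VPS definitions, the following minimal sketch computes the CBR as CSS / CSI and the VPS as the number of codec frames per packet times the frame size in bytes. It assumes the standard G.729 framing of 80 bits per 10 ms frame; the function names are illustrative only.

```python
def codec_bit_rate_kbps(css_bits: int, csi_ms: int) -> float:
    """Codec bit rate: CSS bits emitted every CSI milliseconds (bits/ms == kb/s)."""
    return css_bits / csi_ms


def voice_payload_size(vpp_ms: int, css_bits: int, csi_ms: int) -> int:
    """Voice payload size in bytes carried by one packet for a given VPP."""
    if vpp_ms % csi_ms != 0:
        raise ValueError("VPP must be a multiple of CSI")
    frames_per_packet = vpp_ms // csi_ms
    return frames_per_packet * css_bits // 8


# Assumed G.729 framing: 80 bits (10 bytes) per 10 ms codec frame.
G729_CSS_BITS = 80
G729_CSI_MS = 10

print("CBR [kb/s]:", codec_bit_rate_kbps(G729_CSS_BITS, G729_CSI_MS))
for vpp in (10, 20, 60):
    vps = voice_payload_size(vpp, G729_CSS_BITS, G729_CSI_MS)
    print(f"VPP {vpp} ms -> VPS {vps} B")
```

For G.729 the codec produces one byte per millisecond, so the VPS in bytes is numerically equal to the VPP in milliseconds.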
Since VoIP transport is delay-sensitive, a small VPP is preferred in order to ensure a high quality of service. For example, processing a fax signal using VoIP principles strictly requires a VPP of 10 ms. If a small VPP is used, the overhead is often much larger than the VPS. If encryption is used, the overhead increases and the ratio of the overhead to the VPS is even greater. Voice transmission is then very inefficient in terms of bandwidth utilization.
Another possibility to reduce the demands on the bandwidth is the implementation of the VAD technique. However, the application of this technique brings problems [8]. An active VAD can result in losses of the beginnings of call segments, which greatly decreases the quality of voice services. The examined network is specific in that it is used in battlefield conditions. Standard VAD techniques and algorithms might not function properly here due to loud and diverse background noise. For the above-mentioned reasons, the VAD technique is used neither in the battlefield network of the Czech Armed Forces (CAF) nor in the networks of most operators.
The analyzed network uses point-to-point communication to connect individual nodes. To save bandwidth, it would appear appropriate to use compressed Internet Protocol / User Datagram Protocol / Real-time Transport Protocol (IP / UDP / RTP) headers, e.g. according to Request for Comments (RFC) 2508. However, this is not possible because an external encryptor is used. Another potential difficulty of implementing header compression is handling errors caused by the loss of transmitted packets.
Based on the analysis of the possibilities of streamlining the voice transmission in the researched network, we further focus on changing the VPP, or rather on improving the ratio of the overhead to the VPS. Changing this parameter does not require any hardware changes; it only affects the process of packetization and depacketization of voice at the level of the end phone devices. In addition, enlarging the VPP has a positive impact not only on the bandwidth but also on the utilization of the node elements. Under adverse circumstances (packet loss, excessive delay), however, enlarging the VPP can have a negative impact on the voice quality.
In theoretical calculations of the bandwidth required for one call, it is necessary to know the particular operating technology, including the protocols employed at each level of the communication model. For the general calculation of the bandwidth, we applied the following formula:
$$B = \mathrm{TPS} \cdot \mathrm{PPS}$$
where B represents the bandwidth required, TPS expresses the total packet size and PPS indicates the number of packets per second. The total number of bits needed to transfer a packet at the physical level can be calculated according to the formula:
$$\mathrm{TPS} = (\mathrm{L2H} + \mathrm{OH} + \mathrm{VPS}) \cdot 8 \cdot \mathrm{LLF}$$
where L2H represents the overhead of the layer 2 header, OH is the overhead of the IP / UDP / RTP headers plus the overhead of the encryptor, VPS expresses the voice payload size (all three in bytes) and LLF (link layer factor) indicates the increase of the bandwidth due to the use of a link layer protocol. The number of packets per second depends on the VPP and can be calculated according to the following formula:
$$\mathrm{PPS} = \frac{1}{\mathrm{VPP}}$$
Of the parameters needed to calculate the bandwidth, the size of the L2H and the LLF are unknown. The overhead of the encryptor equals 20 B in the examined network.
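The bandwidth calculation described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' tool: the IP / UDP / RTP headers are taken as 20 + 8 + 12 = 40 B (no IP options or RTP extensions), and because L2H and LLF are not yet known at this point of the text, the example uses the placeholders L2H = 0 and LLF = 1, i.e. it evaluates the network layer only.

```python
def bandwidth_kbps(vpp_ms: float, vps_bytes: int, oh_bytes: int,
                   l2h_bytes: int = 0, llf: float = 1.0) -> float:
    """Bandwidth of one call direction: total packet size times packets per second."""
    tps_bits = (l2h_bytes + oh_bytes + vps_bytes) * 8 * llf  # bits per packet
    pps = 1000.0 / vpp_ms                                    # packets per second
    return tps_bits * pps / 1000.0                           # kb/s


IP_UDP_RTP_HEADERS = 40  # bytes, assumed: 20 (IP) + 8 (UDP) + 12 (RTP)
ENCRYPTOR_OVERHEAD = 20  # bytes, value stated in the text

# G.729 with VPP = 20 ms carries a 20 B voice payload.
plain = bandwidth_kbps(20, 20, IP_UDP_RTP_HEADERS)
encrypted = bandwidth_kbps(20, 20, IP_UDP_RTP_HEADERS + ENCRYPTOR_OVERHEAD)
print(f"network layer, no encryption:   {plain:.1f} kb/s")
print(f"network layer, with encryption: {encrypted:.1f} kb/s")
```

Even before the link layer is considered, the 8 kb/s codec stream occupies three to four times its own bit rate at VPP = 20 ms, which motivates looking at the ratio of the overhead to the VPS.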

Overhead of cHDLC Protocol
The Cisco routers used in the researched network have the E12 interface set in the WAN Interface Card (WIC) mode, allowing simultaneous transmission of voice and data in the form of VoIP. At the link layer, the HDLC protocol is used. However, the Cisco devices do not comply with the general standard ISO / IEC 13239. They use their own proprietary Cisco HDLC protocol (cHDLC). In order to determine the size of the L2H with certainty, we have analyzed the real communication between two routers.
Signal samples have been obtained from the E12 interface by the IZ225 digital radio receiver of INTRIPLE, a.s. The receiver samples the input signal at 110 MS/s and stores the samples in Random Access Memory (RAM). The configuration of the receiver used allows a recording of about 1.2 s to be stored in its memory. The recording obtained is subsequently processed off-line on a common PC using the CipherCAD software tool. After decoding the HDB3 code, a binary sequence is obtained. It is shown in Fig. 2 in the form of a digital bitmap.

Fig. 2 Decoded signal plotted by frames
Black dots represent the value 1, white dots the value 0. The lines of the bitmap are read from left to right and from top to bottom. One full line represents exactly 256 bits, i.e. one E12 frame according to the G.704 Recommendation. One frame takes 125 µs. Each frame is divided into 32 time slots, numbered from 0 to 31. It is evident from Fig. 2 that time slot 0 is reserved for the transmission of frame synchronization. The individual bits are used according to the G.704 Recommendation. A regular alternation of 1 and 0 is evident in the second bit. Time slot 0 of the even frames contains the synchronization sequence 0011011. A regular alternation of 1 and 0 also occurs at the position of the 6th bit, which again has the value 0 in the even frames.
The remaining time slots (1 to 31) are used for data transmission with a data rate of 1 984 kb/s. Data are transmitted in cHDLC frames. The flag 0111 1110 serves to separate the frames. The E12 interface operates in a synchronous mode of data transfer. The idle data link sequence is a continuous repetition of the flag 0111 1110. The regular repetition of the flags appears on the bitmap as a regular alternation of a wide black stripe and a narrow white stripe. While a cHDLC frame is being transmitted, the data on the line appear random. In Fig. 2, this situation is shown twice, i.e. two cHDLC frames are transferred in the displayed section.
Subsequently, one cHDLC frame is selected and its structure is investigated. Fig. 3 shows the entire cHDLC frame written in hexadecimal form.
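The rates quoted for the E12 frame follow directly from the frame geometry described above. The short sketch below only re-derives them from the figures given in the text (256 bits per frame, 125 µs per frame, 32 time slots); the variable names are illustrative.

```python
BITS_PER_FRAME = 256      # one E12 frame
FRAME_DURATION_US = 125   # frame period in microseconds
TIME_SLOTS = 32           # slots 0..31, slot 0 carries frame synchronization

frames_per_second = 1_000_000 // FRAME_DURATION_US        # 8000 frames/s
bits_per_slot = BITS_PER_FRAME // TIME_SLOTS               # 8 bits per slot

line_rate_kbps = BITS_PER_FRAME * frames_per_second / 1000  # whole interface
slot_rate_kbps = bits_per_slot * frames_per_second / 1000   # one time slot
data_rate_kbps = (TIME_SLOTS - 1) * slot_rate_kbps          # slots 1..31

print(f"line rate: {line_rate_kbps:.0f} kb/s")
print(f"per slot:  {slot_rate_kbps:.0f} kb/s")
print(f"data rate: {data_rate_kbps:.0f} kb/s")
```

The difference between the line rate and the data rate is exactly the 64 kb/s consumed by time slot 0.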

Fig. 3 Captured frame of cHDLC protocol
Examining the frame demonstrates that the cHDLC protocol really uses a proprietary frame structure. The frame starts with an address byte that has the value 0000 1111 (the value expressing unicast). It is followed by a control field with the value 0000 0000 and a protocol type field with the value 0000 1000 0000 0000. The protocol type field indicates that an IP packet is encapsulated in the frame. The data part is followed by two bytes carrying the checksum. The frame structure is shown in Fig. 4.
The transmission on the link layer is neither connection-oriented nor reliable. Only data frames (frames of the I type) with the necessary overhead of 6 bytes are sent. One more special byte (the flag) is used for separating the frames. The L2H size on E12 connections is therefore 7 B in the analyzed network.
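The frame layout described above (one address byte, one control byte, a two-byte protocol type, the payload and a two-byte checksum) can be illustrated with a small parser. This is a sketch only: it works on a frame that has already been delimited and de-stuffed, it does not verify the checksum, and the example frame is synthetic (placeholder payload and checksum bytes), not the capture from Fig. 3.

```python
import struct


def parse_chdlc(frame: bytes) -> dict:
    """Split a delimited, de-stuffed cHDLC frame into its fields."""
    if len(frame) < 6:
        raise ValueError("frame shorter than the 6 B cHDLC overhead")
    address, control, protocol = struct.unpack("!BBH", frame[:4])
    return {
        "address": address,      # 0x0F expresses unicast
        "control": control,      # 0x00
        "protocol": protocol,    # 0x0800 means an IP packet follows
        "payload": frame[4:-2],
        "fcs": frame[-2:],       # kept as raw bytes, not verified here
    }


# Synthetic example: header as described in the text, dummy payload and FCS.
example = bytes([0x0F, 0x00, 0x08, 0x00]) + b"\x45" * 20 + b"\xAA\xBB"
fields = parse_chdlc(example)

overhead = len(example) - len(fields["payload"])  # header + checksum
print(f"protocol type: 0x{fields['protocol']:04X}")
print(f"frame overhead: {overhead} B, with one separating flag: {overhead + 1} B")
```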

Fig. 4 Frame structure of cHDLC protocol
The line protocol or the line code used may cause a further increase in the real bandwidth. This phenomenon is reflected by the LLF parameter. In the analyzed network, the HDB3 line code is used, which causes no increase in the bandwidth. However, the cHDLC protocol is a bit-oriented protocol. Each frame is separated by a flag. In order to identify the start and end of frames unambiguously, it must be ensured that the flag cannot occur within the frame. The transmitting party handles this so that whenever a sequence of five ones occurs within the frame, it always inserts a zero after it. The receiving party then removes this zero. This procedure is called bit stuffing. Depending on the character of the transferred data (the number of sequences of five ones), bit stuffing may cause a slight increase in the required bandwidth. The borderline case would be a frame consisting only of ones. In this case, the value of the LLF is equal to 1.2.
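The bit stuffing procedure described above can be sketched in a few lines. The sketch operates on strings of "0" and "1" characters for readability; a real implementation would work on the serial bit stream in hardware, and the function names are illustrative.

```python
def bit_stuff(bits: str) -> str:
    """Insert a 0 after every run of five consecutive 1s (transmitter side)."""
    out = []
    run = 0
    for b in bits:
        out.append(b)
        if b == "1":
            run += 1
            if run == 5:
                out.append("0")  # stuffed zero
                run = 0
        else:
            run = 0
    return "".join(out)


def bit_unstuff(bits: str) -> str:
    """Remove the zero that follows every run of five 1s (receiver side)."""
    out = []
    run = 0
    drop_next = False
    for b in bits:
        if drop_next:            # this bit is the stuffed zero
            drop_next = False
            run = 0
            continue
        out.append(b)
        if b == "1":
            run += 1
            if run == 5:
                drop_next = True
        else:
            run = 0
    return "".join(out)


data = "01111110"                # payload that happens to look like the flag
stuffed = bit_stuff(data)
print(data, "->", stuffed)

all_ones = "1" * 40              # borderline case: only ones
print("worst-case expansion:", len(bit_stuff(all_ones)) / len(all_ones))
```

After stuffing, the payload can no longer contain six consecutive ones, so the flag pattern stays unique on the line.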
In order to determine the size of the LLF, the following experiment is performed. We capture a total of 11 392 packets carrying voice encrypted by IPsec. We perform a statistical evaluation of the number of occurrences of the sequence of five ones over the IP header and the encrypted data (20 + 140 B). The histogram is shown in Fig. 5.
The mean and the median are both equal to 19. If we take the cHDLC header fields into account, the pattern of five ones can additionally occur only in the checksum field, three times at most. Statistically, a reserve of 22 bits should therefore be sufficient. However, since the transmission of unencrypted data or the IP address used (e.g. 31.31.31.31) can significantly increase the number of stuffed bits, we chose a reserve of 32 bits. The LLF value is equal to the ratio of the frame length after and before bit stuffing.
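The relation between the chosen reserve and the LLF can be checked numerically. The text does not state which frame length the reserve is related to; the sketch below assumes the 160 B (20 + 140 B) examined in the experiment, which reproduces the LLF = 1.025 used later in the bandwidth calculation. The function name is illustrative.

```python
def link_layer_factor(frame_bytes: int, stuffed_bits: int) -> float:
    """Ratio of the frame length after and before bit stuffing."""
    frame_bits = frame_bytes * 8
    return (frame_bits + stuffed_bits) / frame_bits


FRAME_BYTES = 160  # assumed reference length: 20 B IP header + 140 B encrypted data

for label, reserve in (("mean (19 bits)", 19),
                       ("mean + checksum (22 bits)", 22),
                       ("chosen reserve (32 bits)", 32)):
    print(f"{label}: LLF = {link_layer_factor(FRAME_BYTES, reserve):.4f}")

# Borderline case: a frame of ones gets one stuffed bit per five data bits.
print("all ones:", link_layer_factor(FRAME_BYTES, FRAME_BYTES * 8 // 5))
```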

Calculation of Bandwidth
After specifying all the necessary parameters, we calculate the required bandwidth depending on the size of the VPP. On the Cisco routers, the VPP can vary from 10 to 240 ms in 10 ms steps. VPP = 20 ms is used as the default. Considering the G.729 codec, the encryptor overhead of 20 B, the cHDLC overhead of 7 B and LLF = 1.025, we obtain the values shown in Fig. 6. The straight line "Codec bit rate" shows the constant bit rate at the output of the encoder (in our case 8 kb/s). The curve "IP layer" gives the bandwidth required for the transmission at the network layer, so only the RTP, UDP and IP headers are considered. The curve "Physical layer" gives the bandwidth required for the transmission of one call at the physical layer in the analyzed network. The graph shows a significant impact of the VPP on the required bandwidth, especially at low VPP values. Increasing the VPP above 100 ms does not bring any significant further savings.
A VPP higher than 60 ms increases the round trip delay above the value established by ITU-T Recommendation P.1010. Therefore, we continue to work with the value VPP = 60 ms, which we consider optimal.
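The three curves described above can be recomputed with the parameters given in the text. The following is a sketch, not the authors' calculation: it assumes 40 B of IP / UDP / RTP headers (20 + 8 + 12 B) and applies the LLF to the whole frame including the flag, so the printed values are indicative only.

```python
CODEC_KBPS = 8.0     # G.729 output bit rate
HEADERS = 40         # bytes, assumed IP (20) + UDP (8) + RTP (12)
ENCRYPTOR = 20       # bytes, from the text
L2H = 7              # bytes, cHDLC overhead including one flag
LLF = 1.025          # link layer factor, from the text


def vps_bytes(vpp_ms: int) -> int:
    """G.729 produces 8 kb/s, i.e. one byte per millisecond."""
    return int(CODEC_KBPS * vpp_ms / 8)


def ip_layer_kbps(vpp_ms: int) -> float:
    """Bandwidth at the network layer (no encryptor, no link layer)."""
    return (HEADERS + vps_bytes(vpp_ms)) * 8 / vpp_ms        # bits/ms == kb/s


def physical_layer_kbps(vpp_ms: int) -> float:
    """Bandwidth at the physical layer of the analyzed network."""
    frame = L2H + HEADERS + ENCRYPTOR + vps_bytes(vpp_ms)
    return frame * 8 * LLF / vpp_ms


reference = physical_layer_kbps(20)   # default VPP on the routers
print("VPP [ms]  IP layer  physical  saving vs 20 ms")
for vpp in range(10, 250, 10):
    phys = physical_layer_kbps(vpp)
    saving = 100 * (1 - phys / reference)
    print(f"{vpp:8d}  {ip_layer_kbps(vpp):8.2f}  {phys:8.2f}  {saving:6.1f} %")
```

Under these assumptions, moving from VPP = 20 ms to VPP = 60 ms roughly halves the physical-layer bandwidth of one call, while each further 10 ms step brings a progressively smaller gain.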

Experimental Results
The results obtained in the previous chapter have been verified by a simple experiment. We connect two Cisco routers using an E12 connection and read the utilization of the connection during a call using the show interface command. The measurements are performed for all allowed values of the VPP. The values obtained are represented by the curves in Fig. 7.

Fig. 6 Dependence of bandwidth on VPP
Since we have no encryptors available, we calculate the required bandwidth for the case without any encryptors. The values obtained are represented by the curve "Calculated". The next curve, "Calculated without flag and Cyclic Redundancy Check (CRC)", corresponds to the values calculated without considering the impact of bit stuffing, the flag necessary to separate the frames and the two bytes of the Frame Check Sequence (FCS).
Thus the curve "Calculated without flag and CRC" stands for the bandwidth measured on the E12 interface controller. Although the Internetwork Operating System (IOS) of the router rounds the measured utilization to units of kb/s, the curves "Measured" and "Calculated without flag and CRC" are identical and confirm the accuracy of our calculation. The curve "Calculated" presents the actual bandwidth required for the transmission at the physical layer.

The Impact of Packet Loss on Voice Quality
The proposed network improvement (VPP optimization) works reliably in the event of an error-free transmission. In a real network, packet losses can occur. Non-delivery of packets during the transmission may be caused by a number of factors. One group of factors is caused by the behaviour of the network under specific conditions. These include in particular the reaction of the HW to a line failure (Spanning Tree Protocol, convergence time of routing protocols). This situation leads to the formation of queues and the dropping of packets. In such cases, the non-delivery of packets usually lasts for a longer time, and the effect on speech quality is therefore identical for VPP 20 ms and VPP 60 ms.
The second group of factors includes errors due to the properties of the physical layer (thermal device noise, interference, crosstalk on the cable, etc.). These effects can never be completely eliminated. They are reflected by the presence of bit errors during transmission. It is necessary to distinguish between the bit error rate and the packet error rate. If the transmission environment does not use Forward Error Correction, a single corrupted bit causes the entire frame / packet to be dropped. Deriving the relationship between the bit and packet error rates is not easy because it depends not only on the size of the Bit Error Rate (BER) but also on the statistical distribution of the errors and on the size of the transmitted frames / packets. For users, the packet error rate of the connection is what matters.
When transmitting voice, jitter is also an important parameter of the network. If the jitter of a received packet is too large and cannot be compensated by the delay of the de-jitter buffer, the packet is discarded and increases the packet error rate measured at the input of the voice decoder. In the following subchapters, we focus on the impact of the packet error rate at the input of the decoder on the voice quality.
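For orientation only, the simplest special case of the bit-to-packet error relationship can be written down explicitly. The sketch below assumes statistically independent bit errors and no error correction, so a frame is lost as soon as any of its bits is corrupted; real links with bursty errors behave differently, which is exactly the difficulty noted above. The frame sizes are illustrative (7 B cHDLC + 60 B headers and encryptor + G.729 payload).

```python
def packet_error_rate(ber: float, frame_bytes: int) -> float:
    """PER for independent bit errors: 1 - P(all bits of the frame are correct)."""
    return 1.0 - (1.0 - ber) ** (8 * frame_bytes)


FRAME_VPP20 = 7 + 60 + 20   # bytes per frame at VPP 20 ms (illustrative)
FRAME_VPP60 = 7 + 60 + 60   # bytes per frame at VPP 60 ms (illustrative)

for ber in (1e-6, 1e-5, 1e-4):
    per20 = packet_error_rate(ber, FRAME_VPP20)
    per60 = packet_error_rate(ber, FRAME_VPP60)
    print(f"BER {ber:.0e}: PER(20 ms) = {per20:.4%}, PER(60 ms) = {per60:.4%}")
```

Under this assumption a longer frame is more likely to be hit by an error, but three times fewer frames are sent per second at VPP 60 ms.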

Network Planning for VPP 60 ms under G.107
When designing and planning networks carrying voice, a number of tools and recommendations can be used [9,10]. Knowing the transmission parameters, the ITU-T G.107 Recommendation allows the expected call quality to be determined. If we consider the proposed change of the VPP from 20 to 60 ms, this change is directly reflected only in the voice delay. The mean one-way delay increases by 40 ms, the round trip delay by double that, i.e. about 80 ms. This increase is not critical and does not deviate from national or international standards. If we use the default values for all other parameters entering the calculation of the R-factor, we obtain R20 = 80.94 for the G.729 codec and VPP 20 ms, which corresponds to MOS20 = 4.06. If we use VPP 60 ms, we obtain R60 = 80.04, which corresponds to MOS60 = 4.03. The effect of the delay is negligible and does not require further consideration.
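The conversion between the R-factor and the MOS quoted above can be reproduced with the mapping defined in the G.107 Recommendation. The sketch below implements only this final mapping, not the full E-model calculation of R; the function name is illustrative.

```python
def r_to_mos(r: float) -> float:
    """Map the E-model rating factor R to an estimated MOS (G.107 mapping)."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6


for label, r in (("VPP 20 ms", 80.94), ("VPP 60 ms", 80.04)):
    print(f"{label}: R = {r:.2f} -> MOS = {r_to_mos(r):.2f}")
```

A drop of 0.9 in R therefore changes the estimated MOS only in the second decimal place.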

Fig. 8 Comparison of packets loss for different VPP
However, the situation changes when packet losses occur. If VPP 20 ms is used, the loss of one packet represents the loss of a smaller amount of information than the loss of one packet with VPP 60 ms. The E-model uses three parameters to evaluate the impact of packet loss on the speech quality: the random packet-loss probability (Ppl), the burst ratio (BurstR) and the packet-loss robustness factor (Bpl). The Bpl parameter represents the ability of the codec to deal with the loss of packets. The values for the most common codecs are given in the ITU-T G.113 Recommendation; the value Bpl = 19.0 is used for the G.729 codec. The BurstR parameter evaluates the occurrence of bursts of lost packets. If three consecutive packets are lost, the value is equal to 3. In the case of independent random packet losses, the BurstR parameter equals 1. If VPP 60 ms is used, the loss of one packet can be represented by the loss of a burst of three packets with the default VPP, and the successful transmission of one packet is equivalent to the successful transmission of three packets with the default VPP. This situation is illustrated in Fig. 8.
The packet error rate of both compared flows is identical. Therefore, the network using VPP 60 ms may be evaluated with the E-model by considering a network with VPP 20 ms in which only bursts of three consecutive lost packets occur, while the resulting packet error rate is identical. Fig. 9 shows the values calculated for VPP 20 ms and VPP 60 ms at packet error rates from 0 to 20 %.
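The effect of the burst ratio can be sketched with the effective equipment impairment factor of the E-model, Ie,eff = Ie + (95 - Ie) · Ppl / (Ppl / BurstR + Bpl), with Ppl in percent. The sketch below is a simplification under stated assumptions: Ie = 10 for G.729 is an assumed value not given in the text, and R is obtained by subtracting only the loss-related part of Ie,eff from the loss-free values R20 = 80.94 and R60 = 80.04. It is not guaranteed to reproduce Fig. 9 exactly.

```python
def ie_eff(ppl_percent: float, burst_r: float,
           ie: float = 10.0, bpl: float = 19.0) -> float:
    """Effective equipment impairment factor; ie=10 is an assumed G.729 value."""
    return ie + (95.0 - ie) * ppl_percent / (ppl_percent / burst_r + bpl)


def r_factor(r_no_loss: float, ppl_percent: float, burst_r: float,
             ie: float = 10.0, bpl: float = 19.0) -> float:
    """R under packet loss, starting from the R value without loss."""
    return r_no_loss - (ie_eff(ppl_percent, burst_r, ie, bpl) - ie)


R20, R60 = 80.94, 80.04   # loss-free values from the text

print("loss [%]  R (VPP 20, BurstR=1)  R (VPP 60, BurstR=3)")
for ppl in range(0, 21, 2):
    print(f"{ppl:8d}  {r_factor(R20, ppl, 1.0):20.2f}  {r_factor(R60, ppl, 3.0):20.2f}")
```

With these assumptions the two curves stay close at low loss rates and diverge as the loss rate grows, because the burst ratio reduces the denominator of the loss term.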

Fig. 9 Dependence of R-factor on packets loss
In commercial networks, the national legislation requires calls to be provided with an R-factor of 50 or higher. According to the calculated results, this limit corresponds to a loss rate of 11 % for VPP 20 ms and of 8 % for VPP 60 ms. The expected behaviour of the analyzed network, derived from the E-model and the R-factor calculation under the condition that only isolated random packet losses occur, is therefore a satisfactory call quality up to a loss rate of 8 %. At low loss rates (up to 3 %), the voice quality is comparable for both values of the VPP.

Practical Experiments
The planning and monitoring of the quality of voice and real-time services is still a highly topical issue. The publication [11] deals with possibilities of reducing the packet loss rate and increasing the throughput in mobile networks. The published results show, inter alia, that when the packet size increases at a lower error rate, there is a marginal decrease in the packet loss rate for all bandwidths. Applying this conclusion to the examined network, we can say that increasing the VPP to 60 ms leads to a minor reduction in the packet loss rate.
In the previous chapter, we chose a theoretical approach to determine the expected voice quality. To verify the results obtained, we decided to carry out practical experiments and monitor the voice quality. Appropriate methods and monitored parameters are described e.g. in the G.1070 Recommendation. The publication [12] presents current suggestions for new models of voice quality evaluation in the presence of clusters of lost packets. However, we have chosen to monitor the voice quality directly with Cisco 7960 phones. The phones are equipped with firmware version 8.0 (5.0), containing the Listening Quality K-factor scores (LQK) application (version 0.95), which allows the call quality to be monitored. LQK scores are produced by a Cisco proprietary algorithm that is an implementation of P.VTQ, an ITU provisional standard.

Fig. 10 Workplace topology
The experiments are carried out in a workplace consisting of two Cisco 2811 routers, two Cisco 7960 phones and a laptop with the CipherCAD SW. The topology of the workplace is shown in Fig. 10.
All calls are routed from one router to the laptop and then to the other router. Using the program created, packets can be dropped in a controlled manner. Measurements are carried out for random packet losses with VPP 20 ms and VPP 60 ms, and then for losses of bursts of three packets with VPP 20 ms. The same measurements are performed for periodic losses. Each call lasts 150 s; after its termination, the minimum MOS value of the call is read. The results are shown in Fig. 11.
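The three loss patterns used in the measurements (random, periodic and bursts of three packets) can be illustrated by generating drop masks for a packet stream. This is a hypothetical sketch, not the program used in the experiments: the function names are invented, and the burst pattern drops aligned groups of three packets to mimic the loss of one 60 ms packet.

```python
import random


def random_losses(n: int, rate: float, rng: random.Random) -> list:
    """Each packet is dropped independently with probability `rate`."""
    return [rng.random() < rate for _ in range(n)]


def periodic_losses(n: int, rate: float) -> list:
    """Every k-th packet is dropped, with k = round(1 / rate)."""
    if rate <= 0:
        return [False] * n
    period = round(1 / rate)
    return [(i % period) == period - 1 for i in range(n)]


def burst_losses(n: int, rate: float, burst_len: int, rng: random.Random) -> list:
    """Aligned groups of `burst_len` packets are dropped with probability `rate`."""
    mask = []
    while len(mask) < n:
        mask.extend([rng.random() < rate] * burst_len)
    return mask[:n]


rng = random.Random(1)           # fixed seed for repeatability
packets = 150 * 50               # a 150 s call at VPP 20 ms (50 packets/s)
for name, mask in (("random", random_losses(packets, 0.05, rng)),
                   ("periodic", periodic_losses(packets, 0.05)),
                   ("burst of 3", burst_losses(packets, 0.05, 3, rng))):
    print(f"{name:10s}: {sum(mask)} of {len(mask)} packets dropped")
```

All three generators target the same average loss rate, so only the distribution of the losses in time differs between the patterns.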
The results show that the quality of voice (in the examined range and under condition of individual losses) decreases linearly with the increasing packet loss rate.
There is no significant difference whether the packet loss distribution is random in time or periodic. Increasing the VPP to 60 ms does not cause a substantial decrease in voice quality. The R-factor value R = 50 corresponds to MOS = 2.6. Fig. 11 shows that if the packet loss rate does not exceed 6 %, the measured call quality is sufficient in all measured cases (it meets national and international legislation). When the packet loss is about 20 %, the calls are already completely unintelligible.

Conclusion
The weak point of the field network is the bandwidth of the E12 links. We therefore analyzed the options for streamlining the voice traffic and decided that changing the voice payload size is the most appropriate method. To be able to calculate the bandwidth, it is necessary to know the link protocol and line code overhead exactly. Therefore, we captured the communication at the physical layer and identified the overhead on the basis of its analysis. Theoretical calculations, verified in real operation, have shown that bandwidth can be saved by changing the VPP. A VPP of 60 ms, with bandwidth savings of 50 %, was chosen as the optimal value. The proposed network improvement works reliably in the event of an error-free transmission. Next, we analyzed the effect of packet losses on the voice quality for VPP 60 ms. The theoretical results obtained using the E-model, as well as the results of the practical experiments, have shown that when only isolated packet losses occur and the loss rate does not exceed 6 %, there is no substantial deterioration in the voice quality.

Fig. 11 Dependence of MOS on packet losses