Abstract
Video streaming is now popular around the world: users stream video to watch movies online, to learn, and to do office work. Video streaming refers to the transmission of video content, live or recorded, from a server or cloud to end users. Video and music files are arranged and transmitted as sequential packets of data so that they can be played almost immediately; the user needs only a high-speed network connection and a subscription to a streaming service through an application. In this paper, we survey and analyze previous developments in video streaming, including 2D and 3D video streaming, compression technologies, streaming protocols, cloud video processing, and 4K/8K streaming, together with their challenges and limitations, and we outline directions for future development that can help improve the quality of service of video streaming and increase revenue for service providers.
Introduction
Streaming is the continuous transmission of audio or video files from a service provider to a customer [32]. In simpler terms, streaming is what happens when consumers watch TV or listen to webcasts on Internet-connected devices. Streaming is a method of sending and receiving audio, video, or music data as a continuous stream over a network [46,64]. The file plays while the remaining data is still being sent; for example, you can start watching a video when only the beginning of it has arrived on your computer or smartphone. Service providers can use large server and database networks that support millions of simultaneous users to offer viewing services [38,83]. Big streaming companies like Netflix have cloud computing mechanisms in place that keep the most popular content cached close to where it will be accessed, lowering latency and streaming costs [75,80].
A streaming customer needs a fast and reliable internet connection. The minimum speed for good streaming is 2 Mbps (megabits per second) [1], which gives a decent viewing experience: no buffering, no pauses, and no drop in video quality. When streaming content, the data is first received in a buffer, which stores the data for the next few minutes of content, whether it is music, video, or a movie [4]. However, if your internet service is weak, you will experience frequent pauses, and the video will be blurry and of poor quality [107]. If you want to watch an HD or 4K file, the connection should run at 5 Mbps or more [3]. Live streaming works much like streaming other kinds of content, but it is used especially for special events such as sports games or political debates. When watching a live stream, you visit a site (for example, a news website) that is hosted by a web server. That server is connected to a media server, which delivers the video to your device using the Real-time Transport Protocol (RTP) and the Real Time Streaming Protocol (RTSP) [118]. This allows video to be sent in a smaller (compressed) form and then viewed in a higher-quality (decompressed) form on your device.
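The interaction between connection speed and the playback buffer can be illustrated with a toy simulation. This is only a sketch under simplifying assumptions: the function name, the one-second time step, the start-up threshold, and the sample throughput values are hypothetical and do not describe any particular player.

```python
def simulate_playback(throughput_mbps, video_bitrate_mbps, startup_seconds=2.0):
    """Return the time steps (seconds) at which playback stalls.

    throughput_mbps: download speed measured in each one-second step (assumed values).
    video_bitrate_mbps: bitrate of the encoded video.
    startup_seconds: seconds of video that must be buffered before playback (re)starts.
    """
    buffered = 0.0   # seconds of video currently held in the buffer
    playing = False
    stalls = []
    for second, rate in enumerate(throughput_mbps):
        # One second of download adds rate / bitrate seconds of video.
        buffered += rate / video_bitrate_mbps
        if not playing and buffered >= startup_seconds:
            playing = True
        if playing:
            if buffered >= 1.0:
                buffered -= 1.0          # one second of video is played
            else:
                stalls.append(second)    # buffer ran dry: playback pauses
                playing = False          # rebuffer until the threshold is reached
    return stalls


# A 2 Mbps video over a link that drops from 2 Mbps to 0.5 Mbps.
print(simulate_playback([2, 2, 0.5, 0.5, 0.5], video_bitrate_mbps=2))
```

With a link that stays above the video bitrate, the list of stalls is empty; once the throughput falls below the bitrate, the buffer drains and a pause follows.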
Quality of service of video streaming is still a major issue for service providers, even though well-known organizations have developed a range of compression techniques and streaming protocols. Quality of experience (QoE) features were also added to learn users’ requirements and to configure the server or cloud to deliver services according to their preferences, but these efforts have not fully succeeded [59]. This paper addresses video streaming, the state of the art of video streaming technologies, and how they work. The contribution of the paper is based on the analysis of:
Differences between 2D and 3D videos, how they are constructed, and their properties.
Compression technologies such as H.264, H.265, VP8, VP9, and RV40, how a compression algorithm works to compress video, and previous research on compression technologies.
The video streaming protocols and how video is streamed live over the Internet.
How video is processed in a cloud environment, together with the limitations and challenges of 4K and 8K streaming; future development is discussed under open research issues.
The paper is organized into 9 sections. Section 2 presents 2D and 3D video streaming technologies, and Section 3 covers video compression technologies. Section 4 presents the video streaming protocols, and Section 5 covers cloud video processing. Section 6 covers 4K and 8K streaming, and Section 7 presents the challenges and limitations of video streaming. Finally, Section 8 presents open research issues, and Section 9 concludes our work.
Methodology
This research is based on a systematic literature review. We collected data from major databases (Elsevier, SpringerLink, Emerald, Wiley Online Library, and Google Scholar) using the following search string. S1: (“Video” or “Video Streaming”, “2D” or “2D videos”, “3D” or “3D videos”, “4K videos”, “8K videos”, “Quality of Experience”, “Cloud-based videos”, and “Comparison”). Following the PRISMA guidelines, the articles were filtered based on their abstracts, keywords, titles, and results. After that, we applied the systematic literature review (SLR) and systematic mapping study methods. Finally, we discuss the complete structure of this systematic review and its process.
Research questions
Many published articles give only a general summary, whose usefulness depends on what the reader is searching for. These articles show that many issues related to video streaming have been studied and reported as part of a broader description of digital media. Researchers can learn about these current issues from the published papers we examined. Moreover, published systematic reviews can help in conducting an SLR. We therefore formulated research questions for this SLR and sought their answers in authentic papers published in well-known databases. The questions are as follows:
RQ1: Why is video streaming an important service on the internet?
Video streaming is important because it attracts many users for e-learning, live streaming of events, video conferencing, and so on. It is usually sent from prerecorded video files, but it can also be distributed as part of a live broadcast. It can be easily delivered over the internet and played back in real time.
RQ2: What is the difference between 2D and 3D streaming?
The main difference is dimensionality. 2D is “flat”: it uses only the horizontal and vertical (X and Y) dimensions, so the image has only two dimensions and becomes a line if turned to the side. 3D adds the depth (Z) dimension, which allows rotation and visualization from multiple perspectives.
RQ3: Which compression technologies are used to reduce video size for streaming with high quality?
We discuss compression technologies, with justification, in Section 3.
RQ4: What are the protocols currently used for video streaming?
The video streaming protocols are discussed later in this manuscript.
RQ5: What problems do organizations currently face in 4K/8K video streaming?
Currently, organizations face the limited availability of low-cost 4K/8K-capable devices and the cost of high-speed internet that end users need to access this content. Bandwidth is a further challenge for organizations launching high-quality content delivery to consumers. Further details are given in the open research issues.
Selection criteria and non-selection criteria
Our selection criteria target well-known published papers extracted from well-known databases. Most importantly, this SLR covers only papers written in English and published from 2006 to 2021. The selection and non-selection criteria for published papers are given in Table 1.
Selection criteria for publication
We used the keywords (“Video” or “Video Streaming”, “2D” or “2D videos”, “3D” or “3D videos”, “4K videos”, “8K videos”, “Quality of Experience”, “Cloud-based videos”, and “Comparison”) in Elsevier, SpringerLink, Emerald, Wiley Online Library, and Google Scholar.
SLR steps
This SLR focuses on published articles. The following steps were used:
Well-known published papers were extracted from Elsevier, SpringerLink, Emerald, Wiley Online Library, and Google Scholar. All papers relevant to the research were used in this SLR, and unrelated work was excluded. We collected and examined data from the titles and abstracts of related articles. Finally, the most relevant published papers are cited in this paper to increase the effectiveness of our SLR.
2D & 3D video streaming
2D video streaming
Simply put, a 2D video has two dimensions, length and height [125]. Modern 2D animated videos can be made either by hand or on a computer; each picture is followed by another in a slightly different position, then by another in yet another position, and so on, to create movement [129]. Figure 1 shows an illustration of a 2D scene.

2D scene.
The art of creating movement in a two-dimensional space is known as 2D animation. This includes characters, creatures, special effects, and backgrounds. When individual drawings are sequenced together over time, the illusion of movement is created. Normally, animation is done on “2s,” which means that one drawing is made for every 2 frames (12 drawings per second at 24 fps).
Each shot in a 2D animation involves many individual drawings of characters (although some computer programs, for example Anime Studio, can rig characters much like puppets, which can be posed at various keyframes while the computer moves the character between those keyframes) [121].
2D animation remains in demand. In every case, the artist(s) must develop the concept of the animation and produce the full set of drawings that bring it to life. The drawings are then sequenced (combined in order) to produce a moment of animation, with a longer piece requiring a correspondingly greater number of drawings [48]. Usually, one second of animation consists of 24 frames, with a new drawing every 2 frames. At this speed, the change between drawings occurs so fast that the movement appears smooth and continuous to the human eye. Many people associate 2D animation with television shows and children’s cartoons, but the style is used much more widely. 2D animation is also prominent in advertising, mobile and desktop applications, video games, websites, and most other visual media.
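The relation between running time, frame rate, and the number of drawings can be worked out with a short calculation. The function below is an illustrative sketch; its name and default values (24 frames per second, one drawing held for 2 frames) are assumptions taken from the figures quoted above.

```python
import math


def drawings_needed(duration_seconds, fps=24, frames_per_drawing=2):
    """Number of drawings required when each drawing is held for several frames."""
    total_frames = duration_seconds * fps
    # Round up so that a final, partially held drawing is still counted.
    return math.ceil(total_frames / frames_per_drawing)


print(drawings_needed(1))    # one second animated on 2s
print(drawings_needed(60))   # one minute animated on 2s
```

Animating on 2s halves the number of drawings compared with drawing every frame, which is why longer pieces require correspondingly more drawings.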
The literature suggests that volumetric video, which captures objects from multiple angles, is a new immersive representation technique for 3D spaces. Using several cameras, it generates a complex 3D model of the action. Processing volumetric data, however, takes a long time, and processing capacity remains a major problem for today’s handheld devices. To address this, researchers propose a volumetric video streaming platform that offloads rendering to a high-performance cloud server. Rather than sending the volumetric content in its entirety, the server sends the client the rendered 2D view [39].
3D video streaming
Simply put, a 3D video has three dimensions: length, width, and height [22]. The most common format of three-dimensional video is stereoscopic video, created from two video signals, one for each eye [123]. It is the image-based format used by cinemas and current 3D TV for home entertainment. In its simplest form, 3D animation is the process of creating three-dimensional moving images in a digital environment [51]. The same principle is used as in 2D or stop-motion animation. The main difference between 2D and 3D animation is that in 2D the images are hand-drawn, while in 3D they are computer-generated.
3D movies and 3D glasses work together to send each of your eyes a different perspective of the same image. The images are projected in two colors, red and blue, and the special glasses ensure that each eye receives only one of the images. Your brain then assembles the 3D effect.
Ramesh studied the need for the RESCUE system, which develops 3D remote sensing videos for reliable and fast communication in emergencies. With efficient pipeline operations, the proposed system uses little memory on mobile devices. Several approaches have been proposed to convert 2D videos into high-quality 3D. The author recommends using either a dataset or heuristic pre-classification based on the consistency of streaming attributes. By using a depth profile together with buffering, segmentation, and refinement, the process automatically converts remote sensing videos from 2D to 3D. The conventional Hough transform algorithm is not well suited to the hardware because it uses a color histogram to handle slow-moving frames, which results in excessive 2D-to-3D conversion time. The RESCUE framework therefore addresses the shortcomings of current methods with its extended vanishing point and line algorithm. Compared with JAVRAE, the framework is a lightweight method that uses less phone memory and less processor time [111].
3DQoE-oriented and energy-efficient 3D video streaming based on 2D-plus-depth has been studied in [79]. It is well acknowledged that IP networks have overtaken all other networks for video distribution. However, bandwidth-hungry content is driving networks to their limits: operators’ costs are increasing, and consumers’ viewing experiences are not always satisfying. These issues are compounded for 3D content distribution because of the larger amount of data that must be delivered and the difficulty of characterizing the end user’s visual experience. As a result, network providers may be reluctant to deliver 3D video to their customers because of the costs and the uncertain quality gains, and the truly immersive experience of 3D video remains elusive. The authors examine how to distribute 3D video over centrally managed networks efficiently in terms of both quality and energy expenditure; a software-defined network (SDN) is an example of such a network. Their method is based on a network-dependent 3D quality of experience (3DQoE) model and an energy cost model for 3D video streaming. Using these models, they formulate the problem of energy-efficient and 3DQoE-optimized routing of 3D video flows. The video/depth rate allocation of the 3D content is factored into the choice of the best routes for multiple 3D video sources. The NP-hard problem is solved with a heuristic algorithm based on the branch and bound approach after a substantial reduction of the solution search space. Extensive 3D video streaming experiments over an OpenFlow-based SDN, with subjective and objective analyses, show the effectiveness of the proposed solution [79].
Producing 3D animation is not a simple process, and there is no single way to approach it. Broadly, it can be described in terms of two main approaches. The first is the hand-drawn method: as in 2D, 3D animations can be drawn and sequenced by hand, but this approach is rare. More commonly, computer software is used, and this approach involves three fundamental steps:
Visualization. At this stage, all 3D objects that appear in the animation are created in the 3D animation program [89].
Design and Animation. The setup and arrangement of the objects are carried out, and mechanics (movement, transformation) are added [41].
Rendering. The 3D objects, layouts, and mechanics are fully assembled and captured to create the finished product (the animation) [112].
Usually, many additional steps (for example, sound effects, editing, and compositing) contribute to the final cut, but the three steps above are the core requirements.
3D and 2D videos: which is better?
For educational content.
3D animation is reported to be more effective than 2D animation for students [30]. One study reported a result of 81.6% for 3D content against 76.1% for 2D content, which suggests that 3D animation is better suited to educational content than 2D animation.
For viewers.
3D animation is also better than 2D for viewers. When you watch a 3D animated film in a theatre, you feel that you are inside the reality of the movie, and watching it is energizing and exciting [44]. In a 2D movie, you just look at the big screen with ordinary sound, with none of the effects that 3D glasses provide. 3D videos and movies therefore give viewers more enjoyment than 2D videos.
Difference between 3D and 2D videos:
In terms of how it is made, 3D animation is fundamentally different from 2D. 2D represents scenes as flat, frame-by-frame drawings, while 3D (three-dimensional) animation is more comprehensive and realistic. In 2D, we see the pictures one after the other from a single viewpoint. 3D makes greater use of the computer [96].
Fields that use 2D animation: movies, TV commercials, children’s shows, and computer games. 3D animation is used in medicine, aerospace, architecture, engineering, biotechnology, the toy industry, and motion pictures.
3D animation is most commonly used in the medical and space fields; architecture, biotechnology, engineering, the gaming industry, and movies have also adopted it [14].
Software used for 2D includes Adobe After Effects, Adobe Flash Professional, Motion, Toon Boom, and Anime Studio [136].
3ds Max, Autodesk Maya, Cinema 4D, Houdini, ZBrush, and Blender are among the commonly used programs for 3D [9].
Examples of 2D animation include Daffy Duck, Bugs Bunny, Snow White, The Little Mermaid, Family Guy, and more.
Examples of 3D animation include films such as Transformers, Toy Story, Co-op, and Jurassic Park.
In traditional 2D animation, everything you see is hand-drawn and sketched. In 3D animation, you animate characters and objects in 3D space using 3D animation software, in which you can manipulate these characters and objects. Drawing skill is an added benefit for 3D animation rather than a necessity.
In 2D animation, artists create the movement of characters, visual effects, and backgrounds in two-dimensional space by sequencing the individual drawings in order within a specific time.
The characters and objects manipulated in 2D animation have only height and width, while 3D animation involves control of three dimensions (height, width, and depth) of characters and objects and appears more realistic than 2D.
Video compression technologies
Video compression technology can be defined as the re-encoding of video so that it consumes less memory and storage than its original size [139]. This makes it easier to transmit the data over the internet or any other network. The word compression here refers to the elimination of redundant and non-essential data [131]. As a result, the size of the media file is greatly reduced, and it becomes easier to access and transfer. Uncompressed video and media files shot on a camera (e.g., movies) occupy a huge amount of storage [113].
Many studies suggest that video compression and content quality have become an exciting and challenging concern with the rapidly growing use of digital technologies and overall video consumption. Recently, new coding tools were developed under the Joint Exploration Model (JEM) software to provide a substantial bit-rate saving relative to the HEVC standard. The study in [120] compares JEM and HEVC using objective metrics and a subjective quality evaluation. It uses a range of video sequences at two spatial resolutions, High Definition (HD) and Ultra-High Definition (UHD), encoded at various bit rates with both the JEM and HM software. The results show that, subjectively, the JEM codec gives a gain of up to 40% at equivalent low bit rates. Objectively, depending on the spatial resolution, this efficiency gain ranges from 35 to 37 percent. At high bit rates, the HM reference software already delivers high-quality video, so the quality increase brought by the JEM codec is harder to detect. Moreover, some video content is difficult to encode, and JEM then yields only a small perceived increase in quality, especially at the highest bit rates and at 4K resolution [120].
Explanation of algorithm and video codec
Algorithm:
Compressing a video file involves applying an algorithm that creates a compressed file suitable for storage and transmission [91]. An inverse algorithm is then applied to the compressed file to generate a video showing virtually the same content as the original file.
Codec:
A video codec can be defined as a pair of algorithms that work together (coder/decoder) [134]. A problem arises when two different codec standards are mixed: a video file compressed using one standard cannot be decompressed using a different standard [122]. For example, a VP8 encoder is not compatible with an H.265 decoder. The reason is that the two standards use different algorithms, and the output generated by one algorithm cannot be decoded by another.
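The coder/decoder pairing and the incompatibility between different standards can be illustrated by analogy. The sketch below uses two general-purpose compressors from the Python standard library (zlib and bz2) as stand-ins for two video codecs; this substitution is an assumption made purely for illustration, and the helper name `can_decode` is hypothetical.

```python
import bz2
import zlib


def can_decode(encoded, decoder):
    """Return True if the decoder accepts the encoded stream, False otherwise."""
    try:
        decoder(encoded)
        return True
    except (OSError, ValueError, zlib.error):
        # The decoder does not understand this stream format.
        return False


payload = b"frame data " * 100

# Each coder is paired with its own decoder.
zlib_stream = zlib.compress(payload)
bz2_stream = bz2.compress(payload)
assert zlib.decompress(zlib_stream) == payload
assert bz2.decompress(bz2_stream) == payload

# Mixing a coder with the other format's decoder fails.
print(can_decode(zlib_stream, zlib.decompress))
print(can_decode(zlib_stream, bz2.decompress))
print(can_decode(bz2_stream, zlib.decompress))
```

Only the matching decoder recovers the original data, which mirrors the situation of a bitstream produced by one video standard and handed to the decoder of another.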
How does a video compression algorithm work?
The algorithm first looks for spatial and temporal redundancies. The file size is reduced because the algorithm encodes the redundant data only once [114]. Consider, for example, a 60-second shot of a character whose facial expression keeps changing in front of a static background. Instead of encoding the background image for each frame, we encode it once. This is called inter-frame prediction, and it is also responsible for some of the irritating artifacts of digital video.
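The idea of encoding unchanged content only once can be shown with a toy example. The sketch below stores the first frame in full and, for every later frame, only the pixels that differ from the previous frame. It is a simplified, lossless illustration with hypothetical function names and tiny one-dimensional "frames"; real codecs use motion-compensated prediction rather than plain pixel differences.

```python
def encode_sequence(frames):
    """Keep the first frame whole, then only the pixels that changed."""
    first = list(frames[0])
    deltas = []
    previous = first
    for frame in frames[1:]:
        changes = {
            index: value
            for index, (old, value) in enumerate(zip(previous, frame))
            if old != value
        }
        deltas.append(changes)
        previous = frame
    return first, deltas


def decode_sequence(first, deltas):
    """Rebuild every frame by applying the stored changes in order."""
    frames = [list(first)]
    for changes in deltas:
        frame = list(frames[-1])
        for index, value in changes.items():
            frame[index] = value
        frames.append(frame)
    return frames


# Static background (zeros) with one changing "face" pixel at index 2.
frames = [[0, 0, 5, 0], [0, 0, 6, 0], [0, 0, 7, 0]]
first, deltas = encode_sequence(frames)
print(deltas)
assert decode_sequence(first, deltas) == frames
```

Here the background pixels are stored once, and each later frame costs only one stored value instead of four.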
Inter-frame and its types:
An inter-frame is an inter-coded frame and should not be confused with an I-frame, which is intra-coded [73]. The term “inter” refers to inter-frame prediction: the frame is expressed in terms of one or more neighboring frames. This removes temporal redundancy and enables greater compression.
Inter-frame predictions:
The inter-coded frame is divided into macroblocks (small square blocks of the image) [132]. The encoder does not encode the raw pixel values of a block directly; instead, it searches a previously encoded frame, known as the reference frame, for a block similar to the one it is encoding. A block matching algorithm performs this search. When the search succeeds, the block is encoded as a motion vector pointing to the matching block in the reference frame. This process is called motion estimation. Sometimes an identical block is found, and sometimes it is not; in the latter case, the encoder computes the differences between the two blocks. These difference values are known as the prediction error.
In short, the encoder obtains a motion vector, which points to the matched block, and a prediction error. The decoder then recovers the raw pixels of the block using both of them.
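Motion estimation and the decoder-side reconstruction can be sketched with a small full-search block matcher. This is a minimal illustration, not the method of any particular standard: the sum of absolute differences (SAD) cost, the search range, the function names, and the 6 × 6 sample frames are all assumptions.

```python
def get_block(frame, top, left, size):
    """Extract a size x size block whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in frame[top:top + size]]


def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def motion_estimate(reference, current, top, left, size, search=2):
    """Full search around (top, left); returns (motion_vector, prediction_error)."""
    target = get_block(current, top, left, size)
    height, width = len(reference), len(reference[0])
    best_cost, best_vector = None, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if 0 <= y <= height - size and 0 <= x <= width - size:
                cost = sad(target, get_block(reference, y, x, size))
                if best_cost is None or cost < best_cost:
                    best_cost, best_vector = cost, (dy, dx)
    dy, dx = best_vector
    match = get_block(reference, top + dy, left + dx, size)
    error = [[t - m for t, m in zip(row_t, row_m)]
             for row_t, row_m in zip(target, match)]
    return best_vector, error


def reconstruct(reference, top, left, size, vector, error):
    """Decoder side: matched reference block plus prediction error."""
    dy, dx = vector
    match = get_block(reference, top + dy, left + dx, size)
    return [[m + e for m, e in zip(row_m, row_e)]
            for row_m, row_e in zip(match, error)]


reference = [[0] * 6 for _ in range(6)]
reference[1][1:3] = [9, 8]
reference[2][1:3] = [7, 6]

# The object moves down by 1 and right by 2, and one pixel gets brighter.
current = [[0] * 6 for _ in range(6)]
current[2][3:5] = [10, 8]
current[3][3:5] = [7, 6]

vector, error = motion_estimate(reference, current, top=2, left=3, size=2)
print(vector, error)
assert reconstruct(reference, 2, 3, 2, vector, error) == get_block(current, 2, 3, 2)
```

The encoder transmits only the vector and the small error block, and the decoder rebuilds the original pixels from them and the reference frame.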
DRAWBACKS OF INTER-FRAME:
If everything works and the algorithm matches blocks with only small prediction errors, the overall size of the motion vector plus prediction error will be less than that of a raw encoding. But if it fails to find a suitable match, the prediction error will be considerable. The overall size would then be greater than that of a raw encoding, so the encoder falls back to raw encoding for that block (only in exceptional cases like this).
If the matched block in the reference frame was itself encoded using inter-frame prediction, its encoding errors are propagated to the next block. If every frame were encoded this way, the decoder would have nothing to synchronize with, because no reference image could be obtained.
Because of these drawbacks, a reliable and periodic reference frame must be introduced for the technique to be practical. This reference frame is known as an intra-frame.
TYPES OF INTER-FRAMES:
In an image sequence, inter-frames are further classified into two categories [19,28]:
P-frames
B-frames
B-FRAMES:
A B-frame is also known as a bi-directionally predicted picture. It requires the least coding data compared with P-frames and I-frames (close to 25% of an I-frame) because the prediction can be made from an earlier frame, a later frame, or both.
P-FRAMES:
A P-frame is also referred to as a predicted picture. It requires less coding data than an I-frame (nearly 50%) because it makes predictions from an earlier frame. The data required for this prediction consists of the prediction errors transformed into coefficients and the motion vectors. Motion compensation is also involved.
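The effect of the frame types on the size of a group of pictures (GOP) can be estimated with simple arithmetic. The sketch below uses the approximate ratios quoted above (a P-frame at about 50% and a B-frame at about 25% of an I-frame); the GOP pattern, the I-frame size, and the function name are illustrative assumptions, and real frame sizes vary with content.

```python
# Approximate frame sizes relative to an I-frame, as quoted in the text.
RELATIVE_SIZE = {"I": 1.0, "P": 0.5, "B": 0.25}


def gop_size(pattern, i_frame_size):
    """Estimated size of a GOP given its frame-type pattern, e.g. 'IBBPBBPBB'."""
    return sum(RELATIVE_SIZE[frame_type] * i_frame_size for frame_type in pattern)


pattern = "IBBPBBPBB"
predicted = gop_size(pattern, i_frame_size=100)           # sizes in kilobytes (assumed)
all_intra = gop_size("I" * len(pattern), i_frame_size=100)
print(predicted, all_intra)
```

Under these assumed ratios, the nine-frame GOP takes well under half the space of nine intra-coded frames.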
Examples of video compression standards
Video compression technology is a set of techniques for removing or reducing redundancy in video data [69]. The compressed video is much smaller than the uncompressed video, which allows it to be saved in a smaller file and sent over a network more quickly. The efficiency of video compression is related to the bitrate for a given resolution and frame rate [81]: the lower the resulting bitrate, the more efficient the compression.
Video compression may be lossy, in which case the decoded image differs from the original and its quality is lower [67]. For lossy compression, the goal is to develop compression techniques that are efficient and perceptually lossless [135]. That is, although the compressed video differs from the original uncompressed video, the difference is not easily visible to the human eye. Efficient video compression takes considerable processing power and time. Video data is represented as a series of frames, or fields in the case of interlaced video. The sequence of frames contains both spatial and temporal redundancy that video compression algorithms can exploit. Most video compression algorithms use both spatial compression, which exploits redundancy within each frame or field, and temporal compression, which exploits redundancy between different frames.
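The link between bitrate and compression efficiency can be made concrete with a short calculation of the raw bitrate of uncompressed video. The resolution, frame rate, 24 bits per pixel, and the 5 Mbps target used below are assumed example values, and the function names are hypothetical.

```python
def raw_bitrate_bps(width, height, fps, bits_per_pixel=24):
    """Bitrate of uncompressed video in bits per second."""
    return width * height * bits_per_pixel * fps


def compression_ratio(raw_bps, encoded_bps):
    """How many times smaller the encoded stream is than the raw stream."""
    return raw_bps / encoded_bps


raw = raw_bitrate_bps(1920, 1080, fps=30)     # Full HD at 30 frames per second
ratio = compression_ratio(raw, 5_000_000)     # encoded at an assumed 5 Mbps
print(raw, round(ratio, 1))
```

Uncompressed Full HD video needs roughly 1.5 Gbps, so streaming it at a few megabits per second requires a reduction of about two orders of magnitude.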
Video compression standards
Some of the popular video compression standards are as follows:
H.264
H.265
VP8
VP9
RV40
H.264 is intended for high compression of motion pictures in applications such as video conferencing, HD video streaming, digital media storage, live television broadcasting, and other types of communication [25]. H.264 evolved from earlier video coding standards, including ITU-T H.261, ITU-T H.262, and ITU-T H.263 [128]. The standard continues to add new profiles and supplemental data definitions to extend the range of applications it addresses. H.264 is widely available in HD and ultra-HD video solutions.
The first step in H.264/AVC decoding is to parse the bitstream [12]. The encoded macroblocks are entropy decoded and then inverse quantized and inverse scaled. The inverse transform is then applied to form the residual of the macroblock. The macroblock information needed for prediction, including the intra-frame prediction modes and the inter-frame motion vectors, is also decoded.
The biggest advantage of H.264 over previous standards such as MPEG-2 and MPEG-4 Visual is its compression performance [13]. It can deliver better image quality at the same bitrate, or a lower bitrate for the same quality.
H.264 offers great flexibility in compression options and transport support. An H.264 encoder can choose from a wide range of compression tools, which makes it suitable for applications ranging from low-bitrate, low-delay mobile transmission through high-quality consumer TV to professional TV production [87].
Some H.264/AVC applications:
· High-definition TV broadcasting in Europe
· Mobile TV broadcasting
· Video conferencing
· High-definition DVDs
H.265, also known as High-Efficiency Video Coding (HEVC), is a newer video coding format that evolved from H.264, also known as Advanced Video Coding (AVC) [26]. The goal of this specification is to provide the same or better picture quality while improving compression efficiency, making large data streams more manageable and reducing total storage requirements.
H.265 compression uses an approach that is almost equivalent to H.264 compression. Rather than encoding all pixels of every frame, bandwidth consumption can be reduced, or even cut in half, by identifying static areas that do not change from frame to frame and applying encoding only to the areas where pixels change. With blocks of up to 64 × 64 pixels, compared with 16 × 16 pixels in H.264, it is more efficient [130].
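The difference between the two block sizes can be illustrated by counting how many blocks are needed to cover one frame. The sketch below assumes a 3840 × 2160 (4K) frame and counts only maximum-size blocks; in practice, encoders split large blocks adaptively, so the numbers indicate scale rather than actual encoder behavior. The function name is hypothetical.

```python
import math


def blocks_per_frame(width, height, block_size):
    """Number of block_size x block_size blocks needed to cover a frame."""
    return math.ceil(width / block_size) * math.ceil(height / block_size)


h264_blocks = blocks_per_frame(3840, 2160, 16)   # 16 x 16 macroblocks
h265_blocks = blocks_per_frame(3840, 2160, 64)   # 64 x 64 blocks
print(h264_blocks, h265_blocks)
```

Larger blocks let large static or uniform areas be described with far fewer units, which becomes more valuable as resolution grows.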
H.265 has some problems [82]:
Processing power
Multi-screen limit
Camera control
Latency
Hardware upgrades
These are the main problems that significantly affect its deployment.
Over time, an extensive literature has developed on multimedia services, which have exploded in popularity in a short period, especially video. As a result of this trend, there is a greater need for video quality evaluation. Compression technology and the transmission link are the two most significant factors that affect video quality. Compared with the previous video coding standard, H.264/MPEG-4 AVC, the new standard H.265/High-Efficiency Video Coding (HEVC) brings major advances in video encoding. The study in [138] assesses video quality under the H.265/HEVC compression standard and investigates the correlation between video quality and bitrate under various GOP patterns. The assessment is carried out using quantitative metrics for four video clips. Experiments show that under the same GOP pattern, a higher bitrate results in improved video quality. Also, as the GOP size grows, video quality improves while the number of B-frames remains constant. It was also observed that compression efficiency is influenced by the quality and complexity of the video sequences [138].
VP8 is a specification for encoding and decoding high-quality video as a file or bitstream [6,23]. VP8 is part of the open-source WebM project, which is sponsored by Google, along with VP9. Unlike the H.264 codec, VP8 is free to use for high-definition video. This is because Google has released all of the VP8 patents it owns under a royalty-free public license. H.264 incorporates several patented technologies, requires licenses from the patent holders, and offers only limited royalty-free use.
VP8 supports only progressive-scan video signals [6].
VP8 supports multi-core processors.
VP8 is adapted to high-resolution HD video; because it uses only three reference frame buffers, it has a small memory footprint.
Constructed reference frame
Better use of multiple cores
Simplicity
Faster sub-pixel filtering
A major design goal of VP8 was a simple decoding process [8]. Decoding performance on low-powered DSPs, and in particular on the ARM9, one of the world’s most ubiquitous microprocessors, was a significant design parameter from the outset.
VP8 also supports an optional 2-tap sub-pixel filter in place of its normal 6-tap filter. Although the standard 6-tap filter used in VP8 is no more complex than the two-stage filter process used in H.264, the optional 2-tap filter can further reduce the number of operations by as much as 50%.
VP9 is a newer-generation video compression format developed by the WebM Project. It targets the full range of web and mobile use cases, from low-bitrate compressed video to high-definition ultra-HD, with additional support for 10/12-bit encoding and HDR. VP9 reduces bit rate by about 50% compared with the previous codec [2,124]. VP9 is supported in Chrome, Opera, Firefox, and Edge.
This codec delivers video quality on par with HEVC at the same bit rate and works well for online viewing of 4K HD video [94]. The efficiency and operation of VP9 are similar to those of AV1; because the two are so alike, VP9 is sometimes referred to as AV0, a predecessor of AV1. A quality comparison of VP8 and VP9 is given in Fig. 2.

Comparison of VP8 and VP9.
RealVideo is a suite of proprietary video compression formats developed by RealNetworks; the specific format changes with the version [144]. It was first released in 1997 and, as of 2008, was at version 10. RealVideo (.rv) is supported on many platforms, including Linux, Solaris, Windows, Mac, and several types of smartphones [15].
The newest version of RealPlayer that can run on Windows 9x is RealPlayer 8, but that version can easily be modified to play RealPlayer 9 and 10 files by manually adding just three codec and plug-in files from the free distribution of RealPlayer 10, which are not included in RealPlayer 8.
This section reviews the literature related to WebRTC, an umbrella term for several emerging Web technologies aimed at exchanging real-time media. As with other media-related services, the perceived quality of WebRTC communication can be measured using Quality of Experience (QoE) metrics. QoE assessment approaches are either subjective (users' evaluation scores) or objective (models computed as a function of various parameters). The study in [34] focuses on Video Multimethod Assessment Fusion (VMAF), which is typically used to evaluate video streaming services, and examines how it performs in a different type of application, WebRTC. To that end, the researchers built on well-known open-source technologies such as JUnit, Selenium, Docker, and FFmpeg. In addition to VMAF, they computed other objective video QoE metrics, including Visual Information Fidelity in the pixel domain (VIFp), Structural Similarity (SSIM), and Peak Signal-to-Noise Ratio (PSNR), for a WebRTC communication under different network conditions in terms of packet loss. Finally, they used a Mean Opinion Score (MOS) scale to compare the objective results with a subjective evaluation of the same WebRTC streams. They found that the subjective video quality perceived in WebRTC video calls correlated strongly with the objective results computed with VMAF and VIFp, more so than with SSIM, PSNR, and their variants [34].
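Of the objective metrics mentioned above, PSNR is the simplest to compute. The sketch below assumes 8-bit samples given as flat sequences of integers; the function name `psnr` and its parameters are illustrative and are not taken from the cited study.

```python
import math


def psnr(reference, distorted, max_value=255):
    """Peak Signal-to-Noise Ratio in dB between two equal-length sample sequences."""
    if len(reference) != len(distorted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between the reference and the distorted samples.
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals: PSNR is unbounded
    return 10 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    original = [100, 100, 100, 100]
    degraded = [110, 110, 110, 110]
    print(round(psnr(original, degraded), 2))
```

PSNR depends only on pixel-wise error, which is one reason perceptual models such as VMAF are often preferred when the goal is to predict subjective scores.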
A video streaming protocol is a standardized delivery method for splitting a video into chunks, delivering them to the viewer, and reassembling them [35]. With that definition in place, we now review the most widely used video streaming protocols.
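The split, deliver, and reassemble idea can be illustrated with a toy example. This is a minimal sketch under simplifying assumptions: the "video" is just a byte string, segments carry a sequence number, and the hypothetical helpers `split_into_segments` and `reassemble` stand in for what a real protocol does with container formats, manifests, and timestamps.

```python
def split_into_segments(data: bytes, segment_size: int):
    """Cut a byte stream into (sequence_number, chunk) pairs."""
    if segment_size <= 0:
        raise ValueError("segment_size must be positive")
    return [
        (offset // segment_size, data[offset:offset + segment_size])
        for offset in range(0, len(data), segment_size)
    ]


def reassemble(segments) -> bytes:
    """Rebuild the stream, tolerating out-of-order arrival of segments."""
    ordered = sorted(segments, key=lambda segment: segment[0])
    return b"".join(chunk for _, chunk in ordered)


if __name__ == "__main__":
    video = b"frame-data-" * 5
    segments = split_into_segments(video, 16)
    segments.reverse()  # simulate segments arriving out of order
    assert reassemble(segments) == video
    print(len(segments), "segments reassembled")
```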
Real-time messaging protocol (RTMP)
RTMP is used today mainly to ingest live streams [78]. When you configure your encoder to send a live stream to your video hosting platform, it uses the RTMP protocol to communicate with the CDN [95]. The content is ultimately delivered to the end user using a different protocol, typically HLS. RTMP is no longer widely used as a viewer-facing video streaming protocol, since it relies on the Flash plugin, which has been plagued by security issues for years and is quickly becoming obsolete.
RTMP is a robust protocol that delivers very low-latency streams [43]. However, since playback requires the Flash plugin, we do not recommend it for delivery to viewers. The exception, again, is stream ingestion, for which RTMP is one of the best options: it is robust and almost universally supported.
Streaming video content reportedly accounts for 57 percent of mobile network traffic, and video streaming is a technology that demands performance, security, and network stability. One study compares the performance of three video streaming protocols in mobile networks: RTMP, RTSP, and HLS. The assessment considers three aspects: reliability, battery life on mobile devices, and resource consumption (CPU, RAM). It was carried out in the Ecuadorian city of Guaranda, where mobile network coverage is highly variable: the 4G network has limited coverage, and the rest of the city is served by 3G and 2G, which makes the city well suited to the proposed evaluation. HLS, which offered the best compromise across the three aspects examined, was found to be the best protocol currently available for mobile networks in the city of Guaranda [146].
Real-time streaming protocol (RTSP)
Perhaps a lesser-known video streaming protocol, the Real-Time Streaming Protocol (RTSP) was first published in 1997. It was designed to control streaming media servers in entertainment and communications systems. In 2016, an updated version, RTSP 2.0, became available [93]. Overall, it is known as a video control protocol for establishing and controlling media sessions between endpoints. In several ways, RTSP is similar to the HTTP Live Streaming (HLS) protocol, which we discuss later. However, RTSP on its own does not transmit the live streaming data. RTSP servers instead work in tandem with the Real-time Transport Protocol (RTP) and the RTP Control Protocol (RTCP) to deliver media streams.
RTSP was designed to support low-latency streaming, and it is well suited for streaming IP camera feeds (such as security cameras), IoT devices (such as laptop-controlled drones), and smartphone SDKs.
Dynamic adaptive streaming over HTTP (MPEG-DASH)
At the opposite end of the spectrum is MPEG-DASH, one of the newest protocols on the scene [45]. Although it is not yet widely used, this protocol has some major advantages. First, MPEG-DASH supports adaptive-bitrate streaming. This means viewers always receive the best video quality that their current internet connection speed can support. Connection speed tends to vary from second to second, and DASH can keep up. MPEG-DASH also fixes some long-standing technical issues with delivery and compression. Another advantage is that MPEG-DASH is 'codec agnostic', meaning it can be used with almost any encoding format. MPEG-DASH also supports Encrypted Media Extensions and Media Source Extensions, which are standards-based APIs for browser-based digital rights management.
Today, MPEG-DASH, a protocol similar to HLS, is used by only a segment of professional broadcasters. However, we are confident that it will become a standard in the future. The fact that it is not yet widely used can be attributed to compatibility issues (for example, iOS and Apple Safari do not support it), among other issues.
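The core decision in adaptive-bitrate streaming, picking the best rendition the current connection can sustain, can be sketched as follows. This is a deliberately simplified throughput-based rule: the bitrate ladder, the `safety` margin, and the function name `pick_rendition` are assumptions for illustration, and real DASH players also take buffer level and throughput history into account.

```python
def pick_rendition(ladder_kbps, throughput_kbps, safety=0.8):
    """Return the highest bitrate (kbit/s) that fits within a safety margin.

    Falls back to the lowest rendition when even that exceeds the budget,
    so playback can continue at reduced quality.
    """
    budget = throughput_kbps * safety
    affordable = [bitrate for bitrate in ladder_kbps if bitrate <= budget]
    return max(affordable) if affordable else min(ladder_kbps)


if __name__ == "__main__":
    ladder = [400, 1200, 3000, 6000]  # hypothetical renditions in kbit/s
    for measured in (300, 2000, 4000, 9000):
        print(measured, "->", pick_rendition(ladder, measured))
```

A player would re-run this choice before requesting each segment, which is how the stream follows second-to-second changes in connection speed.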
Microsoft smooth streaming (MSS)
Next is Microsoft Smooth Streaming (MSS) [52]. First introduced in 2008, MSS was integral to that year's Summer Olympics. Its popularity has since declined, except among Microsoft-focused developers and those working in the Xbox ecosystem. Smooth Streaming supports adaptive-bitrate streaming and includes some robust tools for DRM. Overall, it is a hybrid media delivery method that functions like streaming but is built on HTTP progressive download.
Unless your main target audience is Xbox users or you plan to build Windows-specific apps, we do not recommend using MSS as a primary video streaming protocol.
HTTP dynamic streaming (HDS)
HTTP Dynamic Streaming (HDS), the successor to RTMP, is Adobe's entry into the streaming protocol field [29]. HDS is a Flash-based streaming protocol, similar to RTMP [18]. It does, however, support adaptive streaming and has a reputation for high quality. HDS is also one of the better protocols in terms of latency. Because of its fragmentation and encryption process, however, its latency is not as low as that of RTMP, which makes it less common for streaming sports and other events where seconds count.
We do not recommend using HDS in general. Flash support has deteriorated in recent years, making it difficult for broadcasters to reach their target audience with this technology. In short, relying on the Flash player to deliver web video is a bad idea these days.
Previous studies have emphasized that Web services are software systems that enable machine-to-machine communication, operate as standalone units, and provide network interoperability using simple web technologies including HTTP and XML. Owing to the rapid growth of cloud services and networks, as well as the introduction of numerous digital transformation solutions, an increasing number of conventional applications are being converted to web-based systems. A service-based approach has a large impact on multimedia content storage, retrieval, dissemination, and collaboration. Thanks to cloud technologies, adaptive HTTP streaming can deliver commercially competitive, real-time, high-resolution immersive content on the go, anywhere and at any time. The study in [53] discusses the evolution and architecture of adaptive HTTP streaming as a web-based content delivery solution, as well as the transition of adaptive HTTP streaming from conventional networks to a cloud-based platform. Measurements from a real-time trial were compared with the findings of a comprehensive performance assessment of traditional and cloud-powered service-based approaches. Compared with the traditional adaptive HTTP streaming solution, the cloud-based multimedia content delivery approach has benefits, yields positive results, and provides a superior user experience [53].
Video protocols for professional live streaming
Many video streaming protocols are available nowadays, and all of them can be used to broadcast live streams. Many of these protocols, including RTMP, RTSP, MPEG-DASH, MSS, HDS, and HLS, have particular use cases for specific broadcasters, as discussed earlier [10]. When all factors are considered, however, HLS wins out, especially in terms of codec compatibility, cross-device compatibility, native HTML5 video player support, and adaptive-bitrate streaming capability [20].
The bottom line is that, for the time being, nearly all broadcasters should use the HLS video streaming protocol. Some users may find that other methods are more suitable for their requirements. HLS is usually the safest choice for streaming live content on the internet, such as sports broadcasts, events, or meetings [56].
Recent research suggests that, as a result of the massive increase in high-tech devices, mobile data traffic has received a lot of attention. The demand for high-resolution content and video streaming on smartphones has grown steadily over time. Mobile Augmented Reality and Virtual Reality (AR/VR) technologies, in particular, are expected to be among the most demanding applications to deploy over cellular networks. The cellular network architecture has been streamlined over time, yet it struggles to keep pace with the exponential rise of mobile traffic in terms of wireless link capacity, bandwidth, and backhaul network. Mobile devices repeatedly request the same popular content at different times, causing a bottleneck in the backhaul link as well as an increase in overall network traffic. To solve these issues, new techniques for caching popular content and performing computation at the network's edge are gaining traction. Adopting such strategies in near-future 5G networks would reduce the end-to-end latency of AR/VR applications by placing less load on backhaul connections and cloud servers. In light of these emerging technologies, the study in [31] highlights current approaches to edge computing, discusses the influential caching methods at the edge, and offers a roadmap for 5G and upcoming cellular technologies [31].
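Caching popular content at the edge can be illustrated with a small least-recently-used (LRU) cache placed in front of an origin server. LRU is only one possible eviction policy and is used here as an assumption, not as the method of the cited work; the class name `EdgeCache` and the `fetch_from_origin` callback are hypothetical.

```python
from collections import OrderedDict


class EdgeCache:
    """A tiny LRU cache standing in for an edge node in front of an origin."""

    def __init__(self, capacity, fetch_from_origin):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._fetch = fetch_from_origin  # called only on a cache miss
        self._items = OrderedDict()      # least recently used entry first
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self._items:
            self._items.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._items[key]
        self.misses += 1
        value = self._fetch(key)          # this request travels over the backhaul
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used entry
        return value


if __name__ == "__main__":
    cache = EdgeCache(2, lambda segment: f"bytes-of-{segment}")
    for name in ["a", "b", "a", "c", "b"]:
        cache.get(name)
    print(cache.hits, cache.misses)
```

Every hit is a request that never reaches the backhaul link, which is the effect the edge-caching literature aims to maximize.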
Cloud video processing
Cloud computing provides the functions and capabilities of a computer as services accessed over the Internet [58]. It is typically used for computing, storing, and managing data outside of a company's local computers. Companies around the world depend on the cloud to manage their data and to create and execute workflows. However, the concept of cloud computing is still relatively recent, and many business leaders do not fully understand what the cloud is.
There are many resources available to help us learn about cloud computing and its benefits. One such resource is YouTube, which has no shortage of videos designed to educate viewers on how cloud computing works. From high-level concepts to detailed tutorials and everything in between, these videos are a great asset for anyone who is just starting out with the cloud.
Video processing requirements are becoming more complex as media consumption habits change and the number of consumer devices capable of displaying video rises, requiring more and more file formats to be supported. With real-time and on-demand content viewing across multiple screens, media companies increasingly have to estimate how much infrastructure to purchase to meet growing demand without over-investing [63]. The video processing method is shown in Fig. 3.

The video processing method.
Achieving reliable, high-quality viewing across multiple screens requires significant video processing, as was the case at the London 2012 Olympics, when 2.8 PB of content was delivered, with traffic peaking at 700 Gb/s as Bradley Wiggins won the gold medal in men's cycling [115].
A series of recent studies has indicated that many emerging edge and cloud applications, such as virtual/augmented reality (V/AR) and autonomous vehicles, depend on efficient video stream processing. These applications need real-time, high-throughput video processing [16]. An Edge-Cloud architecture, a collaborative processing paradigm that combines the edge and the cloud, is used to achieve this. Many approaches, especially neural network (NN) based ones, have been explored to improve the latency and bandwidth utilization of Edge-Cloud video processing. The study in [37] investigates the effectiveness of these NN methods, how they can be combined, and whether combining them increases performance. It proposes a demonstration in which participants are encouraged to try out different NN methods, mix them, and see how the underlying NN changes as a result, as well as how these changes influence accuracy, latency, and bandwidth usage. Splitting, differential communication, and compression were the three techniques studied, and methods for integrating them were developed. The study focuses on conveying the difficulties and trade-offs that come with mixing NN techniques for real-time video processing, as well as the importance of proper configuration [37].
Media organizations that do not have enough video processing infrastructure to meet changing demand can find it difficult to meet customer expectations for high-quality services. Businesses that purchase more infrastructure than they need face unnecessary maintenance costs. Either way, they lose money [68].
The cloud offers a balanced and cost-effective solution to the problem of fluctuating demand by providing the ability to scale video processing capacity up immediately to accommodate high-traffic events and then scale it down again when traffic decreases, while avoiding additional hardware investments that are not used consistently. Just as they pay for utilities such as water or electricity, media companies can exchange up-front capital expenditure for more predictable operating costs that rise and fall with the amount of cloud resources used [55]. Cloud computing can help organizations handle varying video processing demands with great flexibility and speed, enabling them to improve customer service while reducing capital costs [60].
Cloud computing also offers exciting opportunities for media and entertainment companies looking to minimize the risks of launching new initiatives [65]. For example, broadcasters can rapidly roll out new programs or channels using cloud-based transcoding and measure their success without spending on extra infrastructure. Once new initiatives prove successful, they can confidently invest in on-site infrastructure to balance costs over the long term.
Similarly, the cloud makes it possible to carry out one-off projects without long-term investments. For example, several media companies have large content catalogs, but the supporting infrastructure needed to convert video libraries into a new distribution format can be expensive and inefficient. By leveraging the cloud, broadcasters can expand their offerings and generate untapped revenue by making archived videos available to customers on demand.
Several studies suggest that advances in communication technology and the explosive growth of video traffic, driven by the rapid growth of mobile devices (including wearables and smartphones), have presented video service providers with significant business opportunities. One study makes full use of the computing and caching capacity of the edge cloud. Taking the multiple bitrates of video into account, it aims to maximize the profit of the video service provider. It formulates the problem as a 0-1 optimization problem and develops a learning algorithm based on upper confidence bounds from multi-armed bandit theory. The algorithm produces a caching and processing solution in real time to keep up with users' video demand. The key results show that the proposed technique compares favorably with other schemes. Thus the proposed video caching and processing scheme benefits not only the video service provider but also the users who expect quality of service [42].
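The upper-confidence-bound idea from multi-armed bandit theory can be sketched with the textbook UCB1 rule. This is not the algorithm of the cited study, whose details are not given here; it only shows how a learner balances trying each option ("arm", for example a candidate caching decision) against exploiting the one with the best observed reward. The names `ucb1_select` and `run_ucb1` and the reward values are illustrative assumptions.

```python
import math


def ucb1_select(counts, totals, round_number):
    """Pick an arm: any untried arm first, otherwise the highest UCB1 score."""
    for arm, pulls in enumerate(counts):
        if pulls == 0:
            return arm
    scores = [
        totals[arm] / counts[arm]                              # observed mean reward
        + math.sqrt(2 * math.log(round_number) / counts[arm])  # exploration bonus
        for arm in range(len(counts))
    ]
    return max(range(len(counts)), key=scores.__getitem__)


def run_ucb1(reward_fn, n_arms, rounds):
    """Play `rounds` rounds and return how often each arm was chosen."""
    counts = [0] * n_arms
    totals = [0.0] * n_arms
    for round_number in range(1, rounds + 1):
        arm = ucb1_select(counts, totals, round_number)
        totals[arm] += reward_fn(arm)
        counts[arm] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical per-decision profit for three caching options.
    profit = [0.2, 0.8, 0.5]
    print(run_ucb1(lambda arm: profit[arm], n_arms=3, rounds=500))
```

With these fixed rewards the learner ends up choosing the most profitable option most often while still sampling the others occasionally.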
Overcoming challenges
The cost of video processing in the cloud was once excessive, especially for large production facilities [149]. But as more vendors such as Google, HP, Rackspace, and Amazon Web Services (AWS) have entered the market, competition and increased supply have brought down the cost of cloud resources [110]. As these providers continue to grow, users will benefit from economies of scale that make large-scale video processing in the cloud more accessible.
Second, data transfer is a major concern for organizations moving large, high-quality video files. This problem is being addressed by fast data transfer technologies. For example, Aspera and Signiant have developed highly efficient transport technologies that transfer files at maximum speed regardless of file size, transfer distance, or network conditions [102]. In addition, Amazon has introduced its Direct Connect service, which makes moving data between on-premises systems and AWS almost as fast as moving data over a high-speed LAN. These innovations help reduce video transfer bottlenecks.
Edit video in the cloud
There are several benefits to moving video production to the cloud [92]. The capabilities of the cloud are especially valuable for video editing, collaborating with others (often remote workers), and keeping files centralized online. Video editing in the cloud is not a one-size-fits-all solution. Depending on what we need, we may have to explore different kinds of cloud video editors (or it may turn out that desktop editing is exactly what we need).
The way to use the cloud for video production is to harness the computing power of hundreds of remote machines [140]. A virtual machine is a powerful computer located in the server room of a cloud service provider, and we can use this remote computer to render and encode the final, CPU-intensive video.
As previously reported in the literature, improved video coding and transcoding services will boost the main sectors of entertainment, tourism, technology-based education, and healthcare. New codecs, such as AV1, are being created to meet new requirements for high video resolution under bandwidth and performance constraints. However, these new codecs have high computational requirements, necessitating the development of new techniques to speed up their encoding. Cloud computing has a variety of appealing characteristics, including on-demand resource allocation, multi-tenancy, elasticity, and resiliency. These infrastructures are well suited to video coding and transcoding services, as they enable resource adaptation, high availability, and ubiquitous access to workloads. The study in [40] describes a fault-tolerant, cloud-based distributed video coding approach built on an elastic pool of workers and media servers. Distributed software is developed on top of this architecture to split the video coding process into several jobs that can be dynamically assigned to the elastic worker pool. For three well-known video codecs, H.265, VP9, and AV1, the proposed approach is tested in terms of scalability, resource use, and job allocation while varying the number of workers. Furthermore, the encoded video output was assessed for various bit rates and numbers of frames per job using full-reference metrics such as PSNR, MS-SSIM, and VIF. The findings suggest that the approach achieves quality and bitrate comparable to whole-video coding while reducing overall encoding time by more than 90 percent, depending on the encoder and the number of workers [40].
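The idea of splitting one encode into jobs for a pool of workers can be sketched as follows. The sketch uses a local thread pool as a stand-in for an elastic cloud worker pool, and `fake_encode` is a placeholder that only labels each frame range; none of these names come from the cited study, and a real system would invoke an actual encoder and handle worker failures.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_jobs(total_frames, frames_per_job):
    """Return half-open (start, end) frame ranges that cover the whole video."""
    if frames_per_job <= 0:
        raise ValueError("frames_per_job must be positive")
    return [
        (start, min(start + frames_per_job, total_frames))
        for start in range(0, total_frames, frames_per_job)
    ]


def fake_encode(job):
    """Placeholder for encoding one frame range on a worker."""
    start, end = job
    return f"[{start}:{end})"


def encode_distributed(total_frames, frames_per_job, workers, encode=fake_encode):
    """Encode every job on a worker pool and return the results in frame order."""
    jobs = split_into_jobs(total_frames, frames_per_job)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in the order of `jobs`, so the encoded
        # pieces can be concatenated directly afterwards.
        return list(pool.map(encode, jobs))


if __name__ == "__main__":
    print(encode_distributed(total_frames=10, frames_per_job=4, workers=2))
```

The number of frames per job is the main tuning knob: smaller jobs spread better across workers, but in a real encoder they also limit how much the codec can exploit redundancy across frames.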
4K and 8K video streaming and service delivery issues
One of the real challenges facing the video broadcasting world is delivering good video content in 4K and 8K formats from point A to point B over long distances [119]. Since the bandwidth of a given link, say cable or cellular, is fixed, it is necessary to push more and better-quality video data through it to consumer devices such as TVs, tablets, and smartphones. An important solution is the High Efficiency Video Coding (HEVC) product line, which doubles the compression ratio and enables bandwidth savings of 55% or more compared with today's H.264 format [127]. Various other technologies improve the viewing experience, including High-Dynamic-Range (HDR) encoding, statistical multiplexing (StatMux), and Adaptive Bit Rate (ABR) encoding.
Consider this scenario: a live event is being captured by a 4K (or higher) production camera, and at the other end we need to bridge video content delivery efficiently from end to end. The workflow described in this section shows how encoding and decoding technologies and products can enable this capability.
It has been widely discussed in the literature that, with 1 Gbps internet speeds, IoT-powered smartphones, computers, and communications, together with their convergence, are making our homes smarter. This is made possible by fiber-to-the-home (FTTH) technologies. The author of [17] addresses two problems of FTTH- and IoT-enabled smart homes. Millimeter-wave 28 GHz radio relaying of fiber signals to households that fiber cannot reach is a solution to the first problem. Edge computing in the home, along with local control of NB-IoT-based smart bulbs, is a step toward addressing the second problem, Internet distribution within the home. In this way, the paper attempts to cover all three components: IoT, computing, and communication [17].
The chain starts at the source: a news van with a high-performance encoder built into its equipment captures the live video footage and uplinks the HEVC content to the satellite [27]. One such encoder can compress 4K 60p real-time HEVC video using only a quarter of a rack unit [143]. It also draws little power, which can amount to a 90% power saving compared with similar solutions. In addition, single-channel HEVC content is transmitted using only about half the bandwidth required with conventional H.264: 7.6 versus 16 Mbit/s [11]. Statistical multiplexing software can also analyze video streams and optimize bandwidth allocation so that multi-channel content can be uplinked to the satellite simultaneously. Such a simultaneous uplink saves broadcasters time and money compared with relying strictly on the limits of the uplink bandwidth. After the main head-end facility downloads the content, it is edited and encoded using an HEVC video encoder. The content may then go to a smaller head-end, where transcoding takes place using an encoder and an IRD (integrated receiver/decoder). When the video is ready for delivery, it is transmitted via cable, satellite TV, or live media streaming.
Media service providers such as Amazon and Netflix, as well as cable networks, can further process the content using new generations of codecs. Finally, it is delivered to end clients. Most broadcasters expect a steady evolution from HD through 4K and on to 8K over a relatively compressed timeframe.
A recent study notes that the recently released Omnidirectional Media Format (OMAF), which specifies the delivery of 360° video content, supports only equirectangular (ERP) and cubemap projections and limits their video coding to 4K resolution (e.g., 4,096 × 2,048). 4K ERP content can only be displayed at a low viewport resolution, one that is below the resolution of many existing head-mounted displays (HMDs). For this reason, 360° video content above 4K resolution must be made available to take full advantage of high-resolution HMDs. The authors propose two packing schemes, for 6K (e.g., 6,144 × 3,072) and 8K (e.g., 8,192 × 4,096) content, for tile-based streaming of ERP content that comply with the 4K coding restriction and the High Efficiency Video Coding standard. The proposed packing schemes make 6K and 8K effective resolution available at the viewport. Their evaluation shows that the proposed layouts substantially lower streaming bit rates relative to mixed-quality viewport-adaptive streaming of 4K ERP. The findings also show that 8K-effective packing outperforms 6K-effective packing, especially for high-quality videos [100].
Advantages of 4K
4K pictures have more detail
Today, the most obvious benefit of 4K is the detail and clarity of the picture [103]. This is because the 4K pixel count is 3840 × 2160, which is four times that of full-HD pictures [90]. 4K pictures have a sharpness that HD pictures lack. Every detail of the subject is rendered crisply, which makes such pictures engaging to watch.
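The pixel-count comparison is easy to verify. The short sketch below assumes the common consumer resolutions of 1920 × 1080 for full HD, 3840 × 2160 for 4K UHD, and 7680 × 4320 for 8K UHD; the dictionary and function names are illustrative.

```python
# Assumed consumer resolutions as (width, height) in pixels.
RESOLUTIONS = {
    "Full HD": (1920, 1080),
    "4K UHD": (3840, 2160),
    "8K UHD": (7680, 4320),
}


def pixel_count(name):
    """Total number of pixels in one frame at the named resolution."""
    width, height = RESOLUTIONS[name]
    return width * height


if __name__ == "__main__":
    base = pixel_count("Full HD")
    for name in RESOLUTIONS:
        print(f"{name}: {pixel_count(name):,} pixels, {pixel_count(name) // base}x Full HD")
```

Doubling both the width and the height quadruples the pixel count, which is why each step from full HD to 4K to 8K multiplies the amount of raw picture data by four.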
Better image depth
The more you watch 4K pictures, the more of their advantages you notice [105]. A prominent feature is the depth of the picture: the more we zoom in, the more depth is preserved. 4K pictures are more attractive to viewers because they hold up under enlargement far better than HD pictures. Many viewers who were used to HD have now moved to 4K because of this depth.
Better color handling
Another feature of 4K pictures is their finer color blending [36]. Viewers who zoom in on 4K pictures notice sharper colors and smoother blending. Audiences want pictures whose colors draw them in, and 4K pictures can do so; they captured the attention of audiences in a short time [117]. HD pictures do not show the same ability. Hence audiences are moving from HD to 4K pictures.
Disadvantages of 4K and video delivery issues
There are some disadvantages as well. The value of 4K is sometimes doubtful, because many people do not want a giant television [106]. Some viewers find that it is not available in the small sizes that would suit them. However, the benefit of 4K depends on a large image size, which is not possible on a small television.
Advantages of 8K
Pixel count alone is not the only ground-breaking display capability available in the Samsung QLED 8K TV; HDR delivers a phenomenal level of contrast and brightness, so that the audience can immerse themselves in the scene [104]. There was a hint that Xbox's then-upcoming Project Scarlett could offer 8K resolution, but that has been debunked (it is more likely to be optimized for 4K). Netflix and other streaming services are still getting into the business of 4K content, so if there is no 8K content to watch, what is the point?
Disadvantages of 8K and video delivery issues
There is little actual 8K content; most of what exists is on YouTube, and it is largely footage of landscapes. The film industry is not interested in this format for two reasons: 8K is not needed in cinemas, and close-ups of actors' faces would reveal every pore and flaw, which is not aesthetically pleasing and which audiences would not like.
8K TVs are mostly used to play video in other UHD or FHD formats, performing image upscaling [97]. It is not possible to transfer 8K video to a TV in its original format without compression; YouTube, for instance, uses the VP9 codec, and such compression entails a loss of quality [7,116]. Many will ask why not, given that the new HDMI format has been adopted. But over HDMI 2.1, you can transfer 8K without compression only at 30 frames per second [24]. To carry 8K video at a higher frame rate, the video stream must be compressed with codecs, because HDMI does not have enough bandwidth.
A LAN port with a speed of 2000 Mbps is fast enough only for compressed video [21]. To transfer uncompressed 8K, you need a port with a speed of 20 Gbit per second, which is still expensive [99]. Televisions do not yet have optical Internet connections; optical ports may appear in televisions in a couple of years [49].
Most viewers do not see the difference between 4K and 8K when watching TV from a standard viewing distance. The human eye simply cannot distinguish the pixels. Only people with perfect vision can tell 4K from 8K, and even they rate the difference in image quality as insignificant.
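A rough calculation shows why uncompressed 8K outgrows ordinary network ports. The sketch below assumes a 7680 × 4320 frame and ignores blanking intervals and link-level overhead; the bits-per-pixel values (24 for 8-bit 4:4:4, 12 for 8-bit 4:2:0) and the function name are assumptions, so the results are an order-of-magnitude check on the figures above rather than exact interface requirements.

```python
def uncompressed_bitrate_gbps(width, height, fps, bits_per_pixel):
    """Raw video bitrate in Gbit/s, without blanking or link overhead."""
    return width * height * fps * bits_per_pixel / 1e9


if __name__ == "__main__":
    cases = [
        ("8K, 30 fps, 8-bit 4:4:4", 7680, 4320, 30, 24),
        ("8K, 30 fps, 8-bit 4:2:0", 7680, 4320, 30, 12),
        ("8K, 60 fps, 8-bit 4:4:4", 7680, 4320, 60, 24),
    ]
    for label, width, height, fps, bpp in cases:
        rate = uncompressed_bitrate_gbps(width, height, fps, bpp)
        print(f"{label}: {rate:.1f} Gbit/s")
```

Under these assumptions, raw 8K at 30 frames per second needs on the order of 12 to 24 Gbit/s depending on chroma subsampling, far beyond a 2000 Mbps LAN port, and doubling the frame rate doubles the requirement.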
Challenges & limitations
Millions of people have been stranded at home due to the pandemic, and they are using the Internet more than ever for education, service, and entertainment, which involve live streaming. Due to the lack of technological advancement in the least developed countries, the sudden increase in internet traffic has slowdown the country’s progress of overall Internet [33]. However, the comparatively slow internet speed of broadband and cellular networks in the least developed countries is due to constraints in overall usable bandwidth per location, poor penetration of optical fiber cable infrastructure, difficulties in telecom infrastructure rollout, and the telecom sector’s unstable state [101]. Secondly, high prices of fast networks and smaller returns in another limitation, which is affecting video streaming [74]. Moreover, even when there are high-speed connections available, many people here simply cannot afford either the devices required or the account access [47]. People have lower levels of earnings and discretionary income, and the amount of money available to the general public to spend on anything like Internet access is very small. Third-world countries impose Internet access limits and it’s a common practice that they impose penalties for breaching them. Moreover, to save money, most people tend to use old models of computing devices. Previous models of computers did not support 4K / 8k video streaming, however, are not readily available in markets [54]. The cost of new equipment is very high due to which people are reluctant to use it.
Several studies have acknowledged that, in the roadmap for content distribution systems, 5G technology aims to carry both video and broadband traffic over the shared framework of the telecommunications network. Because 5G networks offer greater flexibility, more bandwidth, and lower latency than existing technology, they would benefit streaming services. The most widely used streaming technology, Dynamic Adaptive Streaming over HTTP (MPEG-DASH), still requires a latency of about ten seconds to ensure a good quality of experience (QoE). For live events, this makes MPEG-DASH unfavorable compared with a traditional pipeline. To deliver live television services over 5G networks, the latency of streaming technology must be reduced. The broadcast industry has adopted the Chunked Common Media Application Format (Chunked CMAF) to achieve a latency of less than a second. A real Chunked CMAF deployment for MPEG-DASH is presented in [133], which also compares the QoE results for legacy MPEG-DASH content with those for CMAF-driven content to further assess the benefits of CMAF.
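The gap between segment-based and chunk-based delivery can be illustrated with a back-of-the-envelope model. The Python sketch below is a deliberate simplification and is not taken from [133]: it assumes that a player holds a fixed number of media units (whole segments for legacy MPEG-DASH, chunks for Chunked CMAF) before playback, plus a fixed overhead for encoding and network transfer. The function name, the unit durations, the buffer depth, and the overhead are all hypothetical values chosen for illustration.

```python
def live_latency_s(unit_duration_s, units_buffered, overhead_s=0.3):
    """Rough live latency: buffered media duration plus a fixed overhead.

    unit_duration_s: duration of one deliverable unit (segment or chunk)
    units_buffered:  number of units held before playback (assumed)
    overhead_s:      assumed encoding and network overhead in seconds
    """
    return unit_duration_s * units_buffered + overhead_s


# Assumed values: 3 s segments versus 0.2 s chunks, three units buffered.
segment_latency = live_latency_s(unit_duration_s=3.0, units_buffered=3)
chunk_latency = live_latency_s(unit_duration_s=0.2, units_buffered=3)

print(f"segment-based delivery: about {segment_latency:.1f} s")
print(f"chunk-based delivery:   about {chunk_latency:.1f} s")
```

With these assumed values, the segment-based case lands near the ten-second range and the chunk-based case stays under one second, because the latency in this model scales with the duration of the smallest unit that can be delivered.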
People are largely unaware of current developments in video streaming and unconcerned about its future [76,85]. Few devices on the market support 4K/8K. People in the least developed countries cannot afford fast Internet packages, so 4K/8K web streaming remains a major challenge [137]. Furthermore, since these devices have limited storage space, consumers are unable to download high-quality content. Getting good bandwidth was a challenge well before the launch of 4K. With 4K came a slew of new problems, and with 8K and higher resolutions there are many more. Distributing 4K or 8K content demands a significant amount of bandwidth [88]. Given this, available bandwidth that falls slightly short even of what HD (1080p) requires is another challenge.
Open research issues
3D videos are mostly based on simulated objects, so good visual quality requires high-quality image composition, which increases the overall size of the videos [57]. Streaming video over the Internet requires compressed, small video files, so a new video compression method needs to be designed and developed that compresses videos strongly enough for streaming over slow networks [109]. The size of 3D videos must be considered for network streaming during their development, and region of interest (ROI) compression methods can also be applied to 3D video streaming [72,147]. The industry also requires the design and development of new video compression technologies beyond H.265 that compress more efficiently without compromising video quality for storing and streaming videos.
New applications for online video editing are required that support all types of video compression formats, allow proper editing without compromising video quality, and take the streaming network into account [77]. A useful feature for such applications would be to inform the editor automatically of the network bandwidth required to stream the video at the quality selected for the final file. New algorithms based on machine/deep learning are needed for fast compression and decompression that automatically adjust the compression rate to each client's network conditions without degrading video quality [148].
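As a baseline for what such adaptive algorithms must improve on, the idea of matching video quality to network conditions can be sketched with a simple rule-based selector. The Python example below is not a machine learning method and is not drawn from the cited works: it picks the highest rendition from a bitrate ladder that fits within a safety margin of the measured throughput. The ladder values, the safety factor, and the function names are hypothetical and chosen only for illustration.

```python
# Hypothetical bitrate ladder, sorted ascending: (label, bitrate in Mbps).
# The values are illustrative assumptions, not measurements.
LADDER = [
    ("480p", 1.5),
    ("720p", 3.0),
    ("1080p", 6.0),
    ("4K", 20.0),
    ("8K", 80.0),
]


def select_rendition(throughput_mbps, ladder=LADDER, safety=0.8):
    """Return the highest (label, bitrate) that fits safety * throughput.

    Falls back to the lowest rendition when nothing fits.
    """
    budget = throughput_mbps * safety
    best = ladder[0]
    for label, bitrate in ladder:
        if bitrate <= budget:
            best = (label, bitrate)
    return best


def required_bandwidth_mbps(label, ladder=LADDER, safety=0.8):
    """Bandwidth a client needs before the selector picks this rendition."""
    for name, bitrate in ladder:
        if name == label:
            return bitrate / safety
    raise ValueError(f"unknown rendition: {label}")


for measured in (1.0, 10.0, 30.0):
    label, bitrate = select_rendition(measured)
    print(f"{measured:5.1f} Mbps measured -> {label} at {bitrate} Mbps")
```

The second function mirrors the editor-facing feature described above: given a target quality, it reports the bandwidth a viewer would need under the same assumptions. A learning-based approach would replace the fixed ladder and safety factor with values predicted from observed network behavior.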
Cloud computing provides infrastructure for storing videos, but the upload and download speeds of the cloud still vary with the distance between users and the cloud [61,66]. Research is required on the quality of experience of video processing on the cloud side, such as game rendering and delivery to users, and online tools for editing and composing videos [70]. Video security and copyright are also big issues in a cloud environment: new movies and series are shared by users, which results in a loss of revenue for video creators [5,126]. Algorithms are therefore still required that recognize the copyright of videos and restrict their further sharing and download [145].
4K and 8K video streaming is a major challenge for service providers: the required network speed and processing devices are not available at low cost, and not every user can afford to use this technology [50]. In the future, the development of new video file formats will enable streaming of 4K and 8K videos over the Internet, and network speed will also increase with the installation of 5G [98,108]. Research is required on the development of new low-priced devices that can process 4K and 8K video and that everyone can afford to purchase.
The QoE of 2D and 3D videos has been assessed by several researchers, and a service delivery framework has also been given, but a cloud-based automatic framework still needs to be developed that collects subjective and objective data on users' needs and monitors the real-time network environment to find problems and send a report to the user and the administrator [62,71]. The QoE of video streaming protocols has not yet been analyzed to determine which streaming protocol, combined with which compression technique, provides a better experience to users; research is still required to resolve this and to recommend to organizations a better combination of video streaming protocol and file format [141]. The QoE assessment of 4K and 8K video is still an open research area, and its analysis will help develop a video streaming framework for quality of service delivery of these high-definition videos [142].
Conclusion
In this paper, we surveyed the video streaming compression technologies and protocols that help to stream 2D, 3D, 4K, and 8K video. Further, we analyzed cloud video processing and the limitations and challenges that service providers face in streaming video with QoS from cloud to client. This paper presents the previous state-of-the-art technologies for video streaming and the future of streaming. Open research issues were also given for the future design and development of new video compression technologies and protocols for smooth streaming of videos on the Internet.
Conflict of interest
The authors do not have any conflict of interest to report.
