Abstract
Compressing compound documents and images is more demanding than compressing ordinary images, since they mix text, pictures and graphics. The principal requirement in compound document or image compression is preserving the quality of the compressed data. In this paper, different procedures are applied under a block-based classification to distinguish the segments of a compound image. The segmentation process starts by separating the entire image into smooth and non-smooth blocks with a sparse decomposition technique. A gray wolf optimization-based fuzzy C-means (FCM) algorithm is then employed to segment background, text, graphics, image and overlap regions, which are compressed individually using adaptive Huffman coding, embedded zerotree wavelet coding and H.264 coding. Experimental results demonstrate that the proposed scheme increases the compression ratio, enhances image quality and also limits computational complexity. The proposed method is implemented on the MATLAB platform.
1 Introduction
Currently, with the advent of new technology, high-speed internet and the need for large amounts of data storage, image compression is one of the most important tasks. Additionally, medical science increasingly requires huge numbers of images, most of which are grayscale, to be stored digitally. Furthermore, in wireless sensor networks, where low-power devices are deployed, image compression techniques are required to reduce power consumption, transmission time and failure probability [12]. In JPEG2000, there is a choice of two discrete wavelet filters: the filters can be lifted (factorized) in order to speed up the convolution step. The 9/7 filter is chiefly suited for high-visual-quality compression. The use of floating-point arithmetic in the discrete wavelet transform and the associated rounding errors make it unsuitable for strictly lossless compression. The filter has, however, better de-correlation properties than the shorter 5/3 filter and hence better compression performance [7]. Bit-allocation approaches did not change the coding structure; they simply gave more bits or finer quantization steps to the text/graphic areas. However, these approaches could not deal with cases where most of the image was text/graphic region [18].
In fractal image compression (FIC), an image is usually first partitioned into square blocks, and these blocks compose a set called a pool. Using two block sizes, the image is partitioned into two dissimilar pools: the pool composed of the larger blocks is called the domain pool, and the other is called the range pool. The cells in the range pool are the blocks to be encoded [15]. Several digital watermarking algorithms have been proposed with different contributions. The goal is to embed a watermark that is imperceptible in the image, while the copyright holder can detect its existence using a proper private key. Roughly speaking, these watermarking schemes can be categorized by processing domain, signal type of the watermark and hiding position. There are two processing-domain categories: the spatial domain and the transform domain. In contrast to spatial domain-based methods, transform domain-based methods can embed more watermark bits and have better robustness against attacks such as noise, JPEG compression and Gaussian low-pass filtering; they have therefore become one of the study focuses in this community [9]. The image to be compressed is grayscale with pixel values between 0 and 255. Compression refers to reducing the quantity of data used to represent a file, image or video without excessively reducing the quality of the original data. It also reduces the number of bits required to store and transmit digital media. Compression can be defined as the process of reducing the actual number of bits required to represent an image [13].
There are some stringent requirements for the compression of video-like screen contents in screen-sharing scenarios. First, screen compression schemes should guarantee high-fidelity display, especially for textual contents, to preserve the visual experience. Second, screen compression algorithms should achieve a high compression ratio to meet the network requirements [11]. Real-time, high-quality compressed screen image transmission can also be used in many recently proposed applications, such as cloud, cloudlet screen and cloud mobile computing. In cloud computing, the data transfer bottleneck is an obstacle. If the computer screen image can be transmitted in real time and with high quality from the host to the user site, the data transfer bottleneck problem in cloud computing can be solved [14].
In this article, a sparse decomposition method for splitting the image into smooth and non-smooth blocks and a gray wolf-based FCM (GW-FCM) optimization for clustering are proposed. Different coders are employed to compress specific types of image pixels: an adaptive Huffman coder compresses the smooth blocks or background, an EZW coder compresses the text pixels, and the H.264 method compresses the graphics and overlap pixels. These coders form the compression block and yield the compressed version of the compound input image. The technique has been implemented on the MATLAB platform for performance and efficiency analysis. The paper is organized as follows: literature survey in Section 2, block diagram of the proposed methodology in Section 3, the sparse decomposition method in Section 3.1, gray wolf-based FCM in Section 3.2, compression coders in Section 3.3, results and discussion in Section 4 and conclusion in Section 5.
2 Literature Survey
Gueguen [6] proposed a new compact representation for the fast query/classification of compound structures from very-high-resolution optical remote sensing imagery. This bag-of-features representation relies on the multiscale segmentation of the input image and the quantization of image structures pooled into visual word distributions for the characterization of compound structures. A compressed form of the visual word distributions was described, allowing adaptive and fast queries/classification of image patterns. The proposed representation and the query methodology were evaluated for the classification of the (University of California) UC Merced 21-class data set, for the detection of informal settlements and for the discrimination of challenging agricultural classes.
Ebenezer Juliet et al. [1] proposed a new compound image segmentation algorithm based on the multilayer (foreground/mask/background) mixed raster content (MRC) model. The algorithm first segmented a compound image into different classes. Then, each class was transformed into the three-layer MRC model differently according to the properties of that class. Finally, the foreground and background layers were compressed using JPEG 2000, and the mask layer was compressed using JBIG2. The proposed morphology-based segmentation algorithm designs a binary segmentation mask that accurately partitions a compound image into different layers, such as the background and foreground layers.
Yang et al. [17] proposed a scale and orientation invariant grouping algorithm to adaptively generate textual connected components (TCCs) with uniform statistical features. The minimum average distance and morphological operations were employed to assist the formation of candidate TCCs. Then, three string-level features (i.e. sharpness, color similarity and mean activity level) were designed to distinguish the true TCCs from the false-positive ones that are formed by connecting the high-activity pictorial components. Extensive experiments showed that the proposed framework can segment textual regions precisely from born-digital compound images while preserving the integrity of texts with varied scales and orientations and avoiding overconnection of textual regions (Gnana et al. [2]).
Grailu [5] used the set partitioning in hierarchical trees (SPIHT) coder in the framework of ROI coding, along with some image enhancement techniques, to remove the leakage effect that occurred in wavelet-based low-bit-rate compression. They evaluated the compression performance of the proposed method with respect to some qualitative and quantitative measures. The qualitative measures include the averaged mean opinion score (MOS) curve along with demonstration of some outputs under different conditions.
Kurban et al. [8] used well-known evolutionary algorithms such as evolution strategy, genetic algorithm, differential evolution and adaptive differential evolution, and swarm-based algorithms such as particle swarm optimization, artificial bee colony, cuckoo search and differential search, to solve the multilevel thresholding problem. Kapur's entropy was used as the fitness function to be maximized. Experiments were conducted on 20 different test images to compare the algorithms in terms of quality, CPU running time and compression ratio (Gnana et al. [3]).
Yang et al. [16] studied the subjective quality evaluation of compressed digital compound images (DCIs) and investigated whether existing image quality assessment (IQA) metrics are effective in evaluating the visual quality of compressed DCIs. A new compound image quality assessment database (CIQAD) was therefore constructed, including 24 reference and 576 compressed DCIs. The subjective scores of these DCIs were obtained via visual judgement by 62 subjects using paired comparison (PC), in which HodgeRank decomposition was adopted to generate incomplete but near-balanced pairs (Gnana et al. [4]). Fourteen state-of-the-art IQA metrics were adopted to assess the quality of images in CIQAD, and the experimental results indicate that the existing IQA methods are limited in evaluating the visual quality of DCIs.
Zhu et al. [19] analyzed the characteristics of screen content and coding efficiency of HEVC on screen content. They proposed a new coding scheme, which adopts a non-transform representation, separating screen content into color component and structure component. Based on the proposed representation, two coding modes were designed for screen content to exploit the directional correlation and non-translational changes in screen video sequences. The proposed scheme was then seamlessly incorporated into the HEVC structure and implemented into HEVC range extension reference software HM9.0.
Minaee and Wang [10] have proposed a model that uses the fact that the background in each block was usually smoothly varying and can be modeled well by a linear combination of a few smoothly varying basis functions, while the foreground text and graphics create sharp discontinuity. The algorithms separated the background and foreground pixels by trying to fit background pixel values in the block into a smooth function using two different schemes. One was based on robust regression, where the inlier pixels will be considered as background, while remaining outlier pixels will be considered foreground. The second approach used a sparse decomposition framework where the background and foreground layers are modeled with smooth and sparse components, respectively.
3 Proposed Segmentation-based Compound Image Compression Method
3.1 Compound Image
Paper documents remain the most common and significant form of human communication, even in today's "paperless office": these documents are created on computers and stored in electronic form. The obstacle faced, even with electronic documents, is that they can be quite large. Electronic document images have mixed content types such as text, background and graphics, in both grayscale and color; they are labeled "compound images". Various techniques are available for compressing compound images. To improve the compression ratio, a joint segmentation-based compression technique is introduced here.
In the proposed process, a sparse decomposition-based compression technique is employed. The sparse decomposition step segments the image into smooth and non-smooth constituents. Next, the gray wolf optimization-based FCM procedure is employed to segment the text, overlap and graphics regions. Finally, the adaptive Huffman coder, the EZW coder and H.264 coding are employed for compression.
The flowchart of the proposed method is depicted in Figure 1.
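To make the data flow concrete, the following Python sketch, which is an illustrative assumption rather than the authors' MATLAB implementation, tiles a grayscale image into blocks and routes each block either to the background coder or to further segmentation. The variance test and threshold in `is_smooth` are simplifications that merely stand in for the sparse decomposition split described in the next subsection.

```python
import numpy as np

def split_into_blocks(img, M=16):
    """Tile a grayscale image into non-overlapping M x M blocks (edges cropped)."""
    H, W = img.shape
    img = img[:H - H % M, :W - W % M]
    return (img.reshape(H // M, M, -1, M)
               .swapaxes(1, 2)
               .reshape(-1, M, M))

def is_smooth(block, thresh=25.0):
    """Crude stand-in for the sparse-decomposition test: a block with low
    pixel variance is treated as smooth background."""
    return block.var() < thresh

def route_blocks(img):
    """Group blocks by the processing path they would take in the proposed scheme."""
    routed = {"adaptive_huffman": [], "further_segmentation": []}
    for block in split_into_blocks(img):
        key = "adaptive_huffman" if is_smooth(block) else "further_segmentation"
        routed[key].append(block)
    return routed
```

Non-smooth blocks would subsequently be clustered by GW-FCM and dispatched to the EZW or H.264 coders.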
3.2 Sparse Decomposition Technique
The proposed sparse decomposition method is employed to segment a given image into two layers, background and foreground. The background holds the smooth part of the image and can be well represented with a few smooth basis functions, whereas the foreground holds the text, graphics and overlap regions that cannot be represented with a smooth model. Exploiting the fact that foreground pixels typically occupy a small percentage of the image, we can model them as a sparse component overlaid on top of the background. Consequently, it makes sense to think of the mixed-content image as a superposition of two layers, one smooth and the other sparse, and to use sparse decomposition methods to separate these components.
We first need to derive an appropriate model for the background component. We divide the image into non-overlapping blocks of size $M \times M$, denoted by $f(a, b)$, where $a$ and $b$ represent the horizontal and vertical axes, respectively. Each block is then characterized as the sum of two components, $f = A + S$, where $A$ and $S$ denote the smooth background and the sparse foreground, respectively. The background is modeled as a linear combination of $K$ smooth basis functions:

$$A(a, b) = \sum_{k=1}^{K} x_k Z_k(a, b), \qquad (1)$$

where $Z_k(a, b)$ denotes a 2D smooth basis function and $x_1, \ldots, x_K$ are the parameters of the smooth model. As the model is linear in the parameters $x_k$, it is simple to find the optimal weights once the $Z_k(a, b)$ are fixed. All possible basis functions are ordered in the conventional zig-zag order in the $(u, v)$ plane, and the first $K$ basis functions are selected. Hence, each image block can be represented as

$$f(a, b) = \sum_{k=1}^{K} x_k Z_k(a, b) + S(a, b). \qquad (2)$$

To obtain a more compact notation, we convert each two-dimensional (2D) block of size $M \times M$ into a vector of length $M^2$; the vectorized versions of $f(a, b)$ and $S(a, b)$ are denoted by $f$ and $s$, and $Z$ is a matrix of size $M^2 \times K$ whose $k$-th column is the vectorized version of $Z_k(a, b)$. Equation (2) can then be written as

$$f = Zx + s. \qquad (3)$$
To solve the decomposition problem, we need to impose some priors on $x$ and $s$. In this method, three priors are enforced: sparsity of the coefficient vector $x$, sparsity of the foreground $s$ and connectivity of the foreground. The reason for imposing sparsity on $x$ is that we do not want to use too many basis functions for the background representation; without such a constraint on the coefficients, we might end up in a situation in which the foreground pixels are also modeled by the smooth layer. The second prior, sparsity of the foreground, reflects the fact that the foreground pixels are expected to occupy a small percentage of the pixels in each block. Finally, we expect the nonzero components of the foreground to be connected to each other rather than forming a set of isolated points, so a group sparsity regularization is added to promote the connectivity of the foreground pixels. All these priors can be incorporated in an optimization problem of the form

$$\min_{x,\, s}\ \|x\|_1 + \eta_1 \|s\|_1 + \eta_2 G(s) \quad \text{subject to} \quad f = Zx + s, \qquad (4)$$

where $\eta_1$ and $\eta_2$ are the weights of the regularization terms that need to be tuned, and $G(s)$ denotes the group sparsity of the foreground,

$$G(s) = \sum_{m} \|s_{g_m}\|_2, \qquad (5)$$

where $g_m$ represents the $m$-th group. Here, overlapping groups consisting of all columns and rows of the block are used. Consequently, the group sparsity term can be written as the sum of two terms, one over all columns and the other over all rows of the image:

$$G(s) = \sum_{i=1}^{M} \|s_{\mathrm{row}_i}\|_2 + \sum_{j=1}^{M} \|s_{\mathrm{col}_j}\|_2. \qquad (6)$$
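A minimal numerical sketch of this decomposition is given below in Python. It assumes separable 2D DCT basis functions for $Z$ and a simplified objective that keeps only the sparsity penalty on the foreground (the coefficient-sparsity and row/column group-sparsity terms are omitted for brevity); the number of basis functions, the threshold and the iteration count are arbitrary choices, not values from the paper.

```python
import numpy as np
from itertools import product

def dct_basis(M, K):
    """First K separable 2D DCT basis images of an M x M block,
    ordered by increasing frequency (u + v), vectorised as columns."""
    n = np.arange(M)
    cos = lambda k: np.cos(np.pi * (n + 0.5) * k / M)
    freqs = sorted(product(range(M), repeat=2), key=lambda uv: (uv[0] + uv[1], uv))
    cols = [np.outer(cos(u), cos(v)).ravel() for u, v in freqs[:K]]
    return np.column_stack(cols)                       # shape (M*M, K)

def soft_threshold(r, lam):
    return np.sign(r) * np.maximum(np.abs(r) - lam, 0.0)

def decompose_block(block, K=10, lam=15.0, iters=30):
    """Alternate a least-squares fit of the smooth layer with soft-thresholding
    of the residual; returns (background, foreground) images."""
    M = block.shape[0]
    f = block.astype(float).ravel()
    Z = dct_basis(M, K)
    s = np.zeros_like(f)
    for _ in range(iters):
        x, *_ = np.linalg.lstsq(Z, f - s, rcond=None)  # smooth coefficients
        s = soft_threshold(f - Z @ x, lam)             # sparse foreground
    return (Z @ x).reshape(M, M), s.reshape(M, M)
```

Pixels with a nonzero foreground component are treated as text/graphics candidates, while the fitted smooth layer is passed to the background coder.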
3.3 Gray Wolf-based Fuzzy C-means Optimization
Clustering is the procedure of partitioning a set of objects into subsets known as clusters, so that objects in each cluster are more similar to each other than to objects from other clusters on the basis of the values of their attributes. For handling arbitrarily distributed data sets, soft computing has been introduced into clustering; it exploits tolerance for imprecision and uncertainty in order to achieve tractability and robustness. Fuzzy sets and rough sets have been integrated into the C-means framework to develop the fuzzy C-means (FCM) and rough C-means (RCM) algorithms.
Assume $X = (x_1, x_2, \ldots, x_N)$ is the universe of a clustering data set, $G = (\gamma_1, \gamma_2, \ldots, \gamma_C)$ are the prototypes of the $C$ clusters, and $U = [u_{lm}]_{N \times C}$ is a fuzzy partition matrix, where $u_{lm} \in [0, 1]$ is the membership of $x_l$ in the cluster with prototype $\gamma_m$; $x_l, \gamma_m \in \mathbb{R}^P$, where $P$ is the data dimensionality, $1 \le l \le N$ and $1 \le m \le C$. The FCM algorithm is derived by minimizing the objective function

$$J = \sum_{l=1}^{N} \sum_{m=1}^{C} u_{lm}^{z}\, d_{lm}^{2}, \qquad (7)$$

where $z > 1$ is the weighting exponent on each fuzzy membership and $d_{lm}$ is the Euclidean distance from data vector $x_l$ to cluster center $\gamma_m$.
Using the wolves' hunting strategy, in which gray wolves encircle their prey during the hunt, the centroids are initialized with the help of gray wolf optimization and the degree of membership is then calculated for all feature vectors in all clusters. To model the encircling behavior mathematically, the following equations are used:

$$D = |C \cdot X_p(t) - X(t)|, \qquad (8)$$
$$X(t+1) = X_p(t) - A \cdot D, \qquad (9)$$

where $t$ represents the current iteration, $X_p$ is the position vector of the prey, $X$ is the position vector of a gray wolf, and $A$ and $C$ are coefficient vectors computed as

$$A = 2a \cdot r_1 - a, \qquad (10)$$
$$C = 2 r_2, \qquad (11)$$

where the components of $a$ are linearly decreased from 2 to 0 over the course of the iterations and $r_1, r_2$ are random vectors in $[0, 1]$. The best candidate centroids found by the wolves are taken as the initial cluster centers. The new centroid is then given by the FCM update

$$\gamma_m = \frac{\sum_{l=1}^{N} u_{lm}^{z}\, x_l}{\sum_{l=1}^{N} u_{lm}^{z}}, \qquad (12)$$

and the degree of membership $u_{lm}$ is updated using

$$u_{lm} = \frac{1}{\sum_{j=1}^{C} \left( d_{lm} / d_{lj} \right)^{2/(z-1)}}. \qquad (13)$$

If the change in the objective function $J$ between successive iterations falls below a chosen tolerance, the algorithm stops; otherwise, the centroid and membership updates are repeated.
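The following Python sketch illustrates one plausible way to combine the two ingredients, assuming the standard gray wolf encircling updates are used to search for a good initial set of centroids (each wolf encodes a flattened set of C centers) before the classical FCM centroid and membership updates take over. The exact hybridization details are not fully specified in the text, so the wolf encoding, population size and stopping tolerance below are assumptions.

```python
import numpy as np

def fcm_objective(X, centers, z=2.0):
    """FCM cost J = sum_l sum_m u_lm^z * d_lm^2, with memberships implied by centers."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1) + 1e-12
    u = 1.0 / (d2 ** (1.0 / (z - 1)) * (1.0 / d2 ** (1.0 / (z - 1))).sum(1, keepdims=True))
    return (u ** z * d2).sum(), u

def gwo_init_centers(X, C, wolves=10, iters=40, rng=np.random.default_rng(0)):
    """Gray-wolf search over flattened centroid vectors to seed FCM."""
    dim = C * X.shape[1]
    lo, hi = X.min(0), X.max(0)
    pack = rng.uniform(np.tile(lo, C), np.tile(hi, C), size=(wolves, dim))
    cost = np.array([fcm_objective(X, w.reshape(C, -1))[0] for w in pack])
    for t in range(iters):
        a = 2.0 * (1 - t / iters)                        # a decreases linearly 2 -> 0
        leaders = pack[np.argsort(cost)[:3]]             # alpha, beta, delta wolves
        for i in range(wolves):
            steps = []
            for leader in leaders:
                r1, r2 = rng.random(dim), rng.random(dim)
                A, Cc = 2 * a * r1 - a, 2 * r2
                D = np.abs(Cc * leader - pack[i])        # encircling distance
                steps.append(leader - A * D)
            pack[i] = np.mean(steps, axis=0)
            cost[i] = fcm_objective(X, pack[i].reshape(C, -1))[0]
    return pack[np.argmin(cost)].reshape(C, -1)

def gw_fcm(X, C=4, z=2.0, iters=50, tol=1e-5):
    """FCM iterations started from gray-wolf-optimised centroids."""
    centers = gwo_init_centers(X, C)
    prev = np.inf
    for _ in range(iters):
        J, u = fcm_objective(X, centers, z)                           # membership update
        centers = (u.T ** z @ X) / (u.T ** z).sum(1, keepdims=True)   # centroid update
        if abs(prev - J) < tol:                                       # stopping test
            break
        prev = J
    return centers, fcm_objective(X, centers, z)[1]
```

In the proposed pipeline, the feature vectors X would be built from the pixels of the non-smooth blocks, and the resulting clusters are labeled as text, graphics or overlap regions.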
3.4 Compound Image Compression
Image compression is mostly used to reduce the trivial and redundant parts of the image information. To deal with such images, it is important to recognize the layout and structural data of the image and to choose effective compression methods that suit the different content types it contains. Such methods are designated document image compression approaches. The chief aim of using compression is to achieve low complexity and a high compression ratio in order to meet the energy restrictions of image processing without any loss of the original data [2].
Image compression takes minimization of storage space as its chief objective, and the decompressed image should be an exact replica of the original image [3]. Thus, it is essential to select suitable compression methods for the segmented image blocks.
In the proposed technique, the background, which forms the smooth region, is compressed by means of the adaptive Huffman coder; the text region is compressed using the EZW coder; and the graphics and overlapping regions are compressed with the H.264 coding technique. It is desirable to have coders that encode each type of region in a suitable manner, so that the resulting representation can be processed directly.
3.4.1 Adaptive Huffman Coder
Adaptive Huffman coding uses codeword schemes that compute the mapping from source messages to codewords based on a running estimate of the source message probabilities. The code is adaptive and dynamic so as to remain consistent with the current estimates [4]. In this method, the adaptive Huffman code responds to locality: in essence, the encoder is "learning" the characteristics of the source. The decoder must learn along with the encoder by continually updating the Huffman tree so as to stay in synchrony with the encoder. A further benefit of the scheme is that it requires only one pass over the data.
The adaptive Huffman algorithm comprises two enhancements over the conventional Huffman algorithm. First, the number of interchanges by which a node is moved upward in the tree during a recomputation is restricted to one; this number is conservatively bounded by l/2, in which l is the length of the codeword for x(t+1) when the recomputation begins. Second, the adaptive technique minimizes the values of SUM{l(k)} and MAX{l(k)} subject to the requirement of minimizing SUM{w(k) l(k)}. Adaptive approaches do not assume that the relative frequencies observed in a prefix of the message accurately represent the symbol probabilities over the whole message.
Adaptive coding assumes that the weights in the current tree are proportional to the probabilities associated with the source. This assumption becomes more accurate as the length of the ensemble increases. Under this assumption, the expected cost of transmitting the next letter is SUM{p(k) l(k)}, which is approximately SUM{w(k) l(k)}.
Initially, all nodes are roots of their own degenerate trees consisting of a single leaf. The algorithm repeatedly combines the trees with the smallest probabilities until only one tree is left. The performance of the adaptive Huffman algorithm is bounded from below by S−n+1 and from above by S+t−2n+1. At worst, the adaptive technique transmits one more bit per codeword than the conventional Huffman technique.
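As a rough illustration of the one-pass idea, the Python sketch below rebuilds a static Huffman codebook from the counts seen so far before coding each symbol. This is a simplified stand-in for the tree-update (FGK/Vitter) procedure described above, not a faithful implementation of it, and it assumes the alphabet is known to both encoder and decoder in advance.

```python
import heapq
from collections import Counter
from itertools import count

def huffman_code(freqs):
    """Static Huffman codebook for a frequency table {symbol: count}."""
    tiebreak = count()
    heap = [[f, next(tiebreak), {s: ""}] for s, f in freqs.items()]
    heapq.heapify(heap)
    if len(heap) == 1:                       # degenerate one-symbol alphabet
        return {s: "0" for s in freqs}
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in c1.items()}
        merged.update({s: "1" + w for s, w in c2.items()})
        heapq.heappush(heap, [f1 + f2, next(tiebreak), merged])
    return heap[0][2]

def adaptive_encode(symbols):
    """One-pass encoder: each symbol is coded with the codebook built from the
    counts seen so far, then the counts are updated (the decoder mirrors this)."""
    counts = Counter({s: 1 for s in set(symbols)})   # assumes alphabet known up front
    bits = []
    for s in symbols:
        bits.append(huffman_code(counts)[s])
        counts[s] += 1
    return "".join(bits)
```

Because the decoder can reproduce exactly the same counts after decoding each symbol, it stays in synchrony with the encoder without a frequency table being transmitted separately.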
3.4.2 Embedded Zero Tree Wavelet Coder
The embedded zerotree wavelet (EZW) algorithm is a simple, yet remarkably effective, image compression algorithm with the property that the bits in the bit stream are produced in order of importance, yielding a fully embedded code. The embedded code represents a sequence of binary decisions that distinguish an image from the "null" image. Using an embedded coding algorithm, an encoder can terminate the encoding at any point, thereby allowing a target rate or target distortion metric to be met exactly. EZW consistently produces compression results that are competitive with virtually all known compression algorithms on standard test images.
The EZW algorithm is based on four key elements:

- computation of a discrete wavelet transform;
- prediction of the absence of significant information across scales by exploiting the self-similarity inherent in images;
- entropy-coded successive-approximation quantization; and
- "universal" lossless data compression achieved via adaptive arithmetic coding.
Each wavelet coefficient at a given scale can be associated with a set of coefficients at the next finer scale with the same orientation. A zerotree root (ZTR) is a low-scale "zero-valued" coefficient for which all the associated higher-scale coefficients are also "zero-valued". Coding a ZTR allows the decoder to track down and zero out all the related higher-scale coefficients. The trees defined on the wavelet decomposition are shown in Figure 2A, and the corresponding compression is shown in Figure 2B.
Zerotrees are the main part of EZW, but they are not the only important part. The other part is embedded coding. The aim of embedded coding is to generate a bit stream that can be truncated at any point by the decoder.
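The sketch below illustrates a single dominant pass in Python, assuming a toy quadtree parent-child mapping in which the children of coefficient (i, j) are (2i, 2j) through (2i+1, 2j+1); the subband-ordered scan, the subordinate (refinement) pass and the arithmetic coder of the full EZW algorithm are omitted, so this is only a conceptual sketch intended for small demonstration arrays.

```python
import numpy as np

def descendants(i, j, shape):
    """Indices of all descendants of coefficient (i, j) under a simple
    quadtree mapping with children at (2i, 2j) ... (2i+1, 2j+1)."""
    kids = [(2 * i + di, 2 * j + dj) for di in (0, 1) for dj in (0, 1)]
    kids = [(r, c) for r, c in kids
            if r < shape[0] and c < shape[1] and (r, c) != (i, j)]
    out = list(kids)
    for r, c in kids:
        out += descendants(r, c, shape)
    return out

def dominant_pass(coeffs, T):
    """One dominant pass: emit POS/NEG/ZTR/IZ symbols in raster order,
    skipping coefficients inside an already-coded zerotree."""
    symbols, skip = [], set()
    for i in range(coeffs.shape[0]):
        for j in range(coeffs.shape[1]):
            if (i, j) in skip:
                continue
            c = coeffs[i, j]
            if abs(c) >= T:
                symbols.append(((i, j), "POS" if c > 0 else "NEG"))
            else:
                desc = descendants(i, j, coeffs.shape)
                if all(abs(coeffs[r, c2]) < T for r, c2 in desc):
                    symbols.append(((i, j), "ZTR"))     # zerotree root
                    skip.update(desc)
                else:
                    symbols.append(((i, j), "IZ"))      # isolated zero
    return symbols
```

Starting with T equal to roughly half the largest coefficient magnitude and halving it after each pass reproduces the embedded, priority-ordered behavior described above.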
3.4.3 H.264 Coding Technique
To adapt H.264 intra-frame coding to compound content, two intra coding modes can be developed: RSQ (residual scalar quantization) and BCIM (base colors and index map).
RSQ mode: For graphics blocks containing edges in several directions, intra prediction along a single direction cannot entirely remove the directional correlation among samples; after intra prediction, strong anisotropic correlation still remains. In such cases, it is not effective to apply a transform. One technique is to skip the transform and directly code the prediction residues, similar to traditional pulse-code modulation (PCM).
Let R be the rate. The coding gain is defined as the ratio of the distortions obtained when coding the transform coefficients and the residual samples, respectively, where ε is a factor depending on the probability distribution, σ² is the variance, R is the rate, D_Tc is the distortion when coding the transform coefficients and D_pcm is the distortion of PCM.
For a given prediction direction, only the nearest reconstructed integer pixel along that direction, without filtering, is used for prediction. The reason is that filtering the reconstructed pixels would blur sharp edges in text and graphics blocks and decrease the prediction accuracy. Thus, in Equations (15), (16) and (17), the reconstructed nearest integer pixel selected for prediction may lie farther from the current pixel than combinations of reconstructed pixels would.
BCIM mode: The overlap regions of compound images have few colors but complex patterns. Such blocks can be characterized succinctly by a small number of base colors along with an index map. This is similar to color quantization, which selects a representative set of colors to approximate all the colors of an image. Here, we first obtain the base colors of a block with the help of a clustering algorithm. Taking the luminance plane of a 16×16 overlap block as an example, with each sample expressed by 8 bits: if four base colors are chosen to approximate the colors of that block, only two bits are needed to represent each sample's index before any further compression.
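A minimal sketch of the BCIM idea is given below, assuming plain scalar k-means on the luminance samples of one block; a real codec would additionally entropy-code the palette and the index map.

```python
import numpy as np

def bcim_encode(block, n_colors=4, iters=20, rng=np.random.default_rng(0)):
    """Base-colors-and-index-map coding of one block: quantise the samples to
    n_colors palette entries (plain k-means) and return (palette, index map)."""
    samples = block.astype(float).ravel()
    palette = rng.choice(samples, n_colors, replace=False)      # initial base colors
    for _ in range(iters):
        idx = np.argmin(np.abs(samples[:, None] - palette[None, :]), axis=1)
        for k in range(n_colors):
            if np.any(idx == k):
                palette[k] = samples[idx == k].mean()
    return palette, idx.reshape(block.shape).astype(np.uint8)

def bcim_decode(palette, index_map):
    return palette[index_map]

# A 16x16 block with 4 base colors needs 4*8 bits of palette plus 2 bits per
# sample for the indices: 32 + 256*2 = 544 bits, versus 16*16*8 = 2048 bits raw.
```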
4 Results and Discussion
The proposed methodology was implemented in MATLAB and was assessed by testing the proposed system with input compound images. To validate the performance of the proposed algorithms, experimental assessment is carried out on a variety of images. For experimentation, compound test images containing text, image and background regions are used.
4.1 Performance Metrics
The proposed approach achieves a better compression ratio, decompression time and memory requirement, and it is compared with existing compression procedures in terms of these performance metrics.
4.1.1 Peak Signal-to-Noise Ratio
The peak signal-to-noise ratio (PSNR) is defined as the ratio between the maximum possible power of an image and the power of the corrupting noise. A high PSNR indicates that the decompressed image has good quality. The PSNR is computed using the equation

$$\mathrm{PSNR} = 10 \log_{10} \left( \frac{255^{2}}{\mathrm{MSE}} \right).$$
The compressed and decompressed text/graphics blocks and image/background blocks can be evaluated with the help of the compression ratio. In digital image processing, the compression ratio is defined as the ratio of the size of the original grayscale image to the size of the compressed image. The proposed scheme provides a competitive compression ratio: compression ratio = (size of original image)/(size of compressed image).
4.1.2 Mean Square Error
The mean square error (MSE) is the average of the squared differences between the original and the reconstructed pixel values. It is given by

$$\mathrm{MSE} = \frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} \left( I(i, j) - \hat{I}(i, j) \right)^{2},$$

where $I$ is the original image, $\hat{I}$ is the decompressed image and $M \times N$ is the image size.
4.1.3 Root Mean Square Error
The root mean square error (RMSE), also known as the root mean square deviation (RMSD), measures the differences between the values predicted by a model and the values actually observed from the system being modeled. These individual differences are also known as residuals, and the RMSE aggregates them into a single measure of predictive power.

The RMSE of a model prediction with respect to the estimated variable is the square root of the mean square error:

$$\mathrm{RMSE} = \sqrt{ \frac{1}{n} \sum_{s=1}^{n} \left( A_{\mathrm{obs},s} - A_{\mathrm{model},s} \right)^{2} },$$

where $A_{\mathrm{obs},s}$ is the observed value and $A_{\mathrm{model},s}$ is the modeled value at time/place $s$.
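For reference, the quality and compression metrics defined above can be computed with a few lines of Python; this is a straightforward restatement of the formulas, not the authors' evaluation code.

```python
import numpy as np

def mse(original, decompressed):
    """Mean squared difference between the original and decompressed images."""
    diff = original.astype(float) - decompressed.astype(float)
    return np.mean(diff ** 2)

def psnr(original, decompressed, peak=255.0):
    """Peak signal-to-noise ratio in dB for 8-bit images."""
    m = mse(original, decompressed)
    return float("inf") if m == 0 else 10.0 * np.log10(peak ** 2 / m)

def rmse(original, decompressed):
    return np.sqrt(mse(original, decompressed))

def compression_ratio(original_bytes, compressed_bytes):
    """Size of the original bit stream divided by the size of the compressed one."""
    return original_bytes / compressed_bytes
```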
The sample test image is shown in Figure 3.
4.2 Performance Evaluation
Table 1 reports the compression ratios for the text, image and background test images (1–5) obtained with the proposed optimization-based FCM, the existing multibalanced CS k-means algorithm and H.264. The compression ratio of the proposed technique is higher than that of the available techniques, making it well suited to compressing compound images.
Table 1: Compression ratio of the proposed optimization-based FCM, the existing multibalanced CS (MB CS) k-means algorithm and H.264 for test images 1–5.

| Image | FCM: Text | FCM: Image | FCM: Background | FCM: Average | MB CS k-means: Text | MB CS k-means: Image | MB CS k-means: Background | MB CS k-means: Average | H.264 |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 52.38 | 58.14 | 39.56 | 50.03 | 25.25 | 28.78 | 9.45 | 21.16 | 47.28 |
| 2 | 47.12 | 36.34 | 12.27 | 31.91 | 19.15 | 29.45 | 9.96 | 19.52 | 24.63 |
| 3 | 49.09 | 37.43 | 14.41 | 33.64 | 29.78 | 26.25 | 7.23 | 21.09 | 31.50 |
| 4 | 24.96 | 25.42 | 11.88 | 20.75 | 28.97 | 25.58 | 7.24 | 20.60 | 18.23 |
| 5 | 16.95 | 18.67 | 9.18 | 14.93 | 27.68 | 25.42 | 10.32 | 21.14 | 12.70 |
Furthermore, the compression ratio values of the proposed and existing methods are shown graphically in Figure 4. The figure shows that the proposed segmentation-based compression strategy achieves a higher compression ratio than all the other existing methods.
Different parameters, namely PSNR, structural similarity (SSIM), RMSE, image quality index and the second derivative-like measure of enhancement (SDME), were computed for the test images. The performance evaluation of the text, image and background image quality obtained with the optimization-based FCM segmentation technique is given in Table 2.
Table 2: Performance evaluation of the existing multibalanced CS k-means and H.264 (without segmentation) techniques and the proposed optimization-based FCM technique.

Multibalanced CS k-means (existing):

| Image | PSNR | SSIM | RMSE | Image quality index | SDME |
|---|---|---|---|---|---|
| 1 | 32.81472 | 0.509754 | 10.15761 | 0.871886 | 29.57407 |
| 2 | 33.90579 | 0.278498 | 4.970593 | 0.884218 | 37.57117 |
| 3 | 34.91338 | 0.395751 | 2.485376 | 0.922073 | 38.39932 |
| 4 | 32.12699 | 0.546882 | 1.957167 | 0.931574 | 44.91293 |
| 5 | 32.63236 | 0.264889 | 10.80028 | 0.919492 | 33.87352 |

H.264 without segmentation (existing):

| Image | PSNR | SSIM | RMSE | Image quality index | SDME |
|---|---|---|---|---|---|
| 1 | 32.534 | 0.48322 | 11.2586 | 0.8565 | 28.3 |
| 2 | 33.3355 | 0.20034 | 5.86475 | 0.8763 | 36.234 |
| 3 | 30.8635 | 0.4862 | 2.01854 | 0.923 | 43.234 |
| 4 | 33.3786 | 0.34574 | 3.57858 | 0.905 | 37.34 |
| 5 | 30.3757 | 0.23568 | 11.3865 | 0.88673 | 32.121 |

Optimization-based FCM (proposed):

| Image | PSNR | SSIM | RMSE | Image quality index | SDME |
|---|---|---|---|---|---|
| 1 | 33.95966 | 0.533235 | 8.366322 | 0.871886 | 30.75774 |
| 2 | 34.60073 | 0.343332 | 3.04944 | 0.892074 | 39.74313 |
| 3 | 33.61066 | 0.564935 | 1.407067 | 0.958886 | 46.39223 |
| 4 | 35.75527 | 0.410348 | 1.872407 | 0.964663 | 39.65548 |
| 5 | 33.25413 | 0.297881 | 9.165764 | 0.950517 | 35.17119 |
The performance of the MB CS k-means algorithm-based compression and the H.264 video compression method under the different parameters is shown in Table 2. Test image 1 has a higher PSNR than the other images. The similarity between images is checked with SSIM, and test image 2 shows less distortion than the other test images. The RMSE is lower for test image 3 than for the other test images. The image quality index is better for test images 4 and 5 than for the other test images. The SDME is less sensitive to noise and steep edges, and test image 3 shows better enhancement than the other test images.
The performance of the optimization-based FCM under the different parameters is also shown in Table 2. Test images 2 and 4 have higher PSNR than the other images. The similarity between images is checked using SSIM, and test images 2 and 5 show less distortion compared with the other images. The RMSE is lower for test images 3 and 4 than for the other test images. The image quality index is better for test images 3 and 4 than for the other test images. The SDME is less sensitive to noise and steep edges, and test image 3 shows better enhancement than the other test images.
Table 2 thus summarizes the performance evaluation of the different parameters for the existing MB CS k-means and H.264 (without segmentation) techniques and the proposed optimization-based FCM technique. From the table, it is evident that all the parameters are better for the proposed approach than for the existing methods; thus, the proposed approach compares favorably with the available methods. Based on the implementation results, H.264 performs better than the MB CS k-means technique, yet all the existing methods compared in our evaluation produce much poorer results than the proposed technique.
The running time (in seconds) was also examined for the proposed compression method and for the H.264 video compression technique without segmentation, and the values are shown in Table 3.
Table 3: Running time (in seconds) of the proposed method and of H.264 without segmentation.

| Image | Proposed (with segmentation) | H.264 (without segmentation) |
|---|---|---|
| 1 | 28.426253 | 17.2533 |
| 2 | 1518.380388 | 1189.48544 |
| 3 | 40.701457 | 28.75734 |
| 4 | 37.902061 | 25.26731 |
| 5 | 15.914116 | 13.63731 |
Table 3 shows that the running time of the H.264 compression strategy is lower than that of the proposed technique. Although the running time of the proposed segmentation-based compression approach is higher, the proposed method offers a substantial gain in compression ratio.
5 Conclusion
In this article, compression is performed by compressing the individual constituents, such as image, text and background, obtained from segmentation with the sparse decomposition method and the OFCM technique, in order to improve the compression ratio. The proposed segmentation-based compression is compared with the available methods, and also with the H.264 video compression strategy without segmentation. After compression, the image quality is evaluated via several measures, such as PSNR, SSIM, RMSE, image quality index and SDME, and compared with the available methods. Furthermore, the compression ratio and the running time were also investigated. The investigation shows that the proposed method is effective and offers excellent compression ratios.
Bibliography
[1] S. Ebenezer Juliet, V. Sadasivam and D. Jemi Florinabel, Effective layer-based segmentation of compound images using morphology, J. Real-Time Image Pr. 9 (2014), 299–314. doi:10.1007/s11554-011-0223-8.
[2] G. R. Gnana King and J. H. Jensha Haennah, Hybrid compression scheme using precoding block and fast stationary wavelet transformation, J. Intell. Fuzzy Syst. 31 (2016), 415–421. doi:10.3233/IFS-162154.
[3] G. R. Gnana King and C. Seldev Christopher, Compound image compression using parallel Lempel-Ziv-Welch algorithm, in: IET Chennai Fourth International Conference on Sustainable Energy and Intelligent Systems, Chennai, pp. 522–526, 2013.
[4] G. R. Gnana King and C. Seldev Christopher, Improved block based segmentation algorithm for compression of compound images, J. Intell. Fuzzy Syst. 27 (2014), 3213–3225. doi:10.3233/IFS-141278.
[5] H. Grailu, Textual image compression at low bit rates based on region-of-interest coding, Int. J. Doc. Anal. Recognit. (IJDAR) 19 (2016), 65–81. doi:10.1007/s10032-015-0258-7.
[6] L. Gueguen, Classifying compound structures in satellite images: a compressed representation for fast queries, IEEE Trans. Geosci. Remote Sens. 53 (2015), 1803–1818. doi:10.1109/TGRS.2014.2348864.
[7] K. Jung and R. Seiler, Segmentation and compression of documents with JPEG2000, IEEE Trans. Consum. Electron. 49 (2003), 802–807. doi:10.1109/TCE.2003.1261158.
[8] T. Kurban, P. Civicioglu, R. Kurban and E. Besdok, Comparison of evolutionary and swarm based computational techniques for multilevel color image thresholding, Appl. Soft Comput. 23 (2014), 128–143. doi:10.1016/j.asoc.2014.05.037.
[9] J. Lang and Z.-G. Zhang, Blind digital watermarking method in the fractional Fourier transform domain, Opt. Lasers Eng. 53 (2014), 112–121. doi:10.1016/j.optlaseng.2013.08.021.
[10] S. Minaee and Y. Wang, Screen content image segmentation using robust regression and sparse decomposition, IEEE J. Emerg. Sel. Topics Circuits Syst. 6 (2016), 573–584. doi:10.1109/JETCAS.2016.2597701.
[11] Z. Pan, H. Shen, Y. Lu, S. Li and N. Yu, A low-complexity screen compression scheme for interactive screen sharing, IEEE Trans. Circuits Syst. Video Technol. 23 (2013), 949–960. doi:10.1109/TCSVT.2013.2243056.
[12] S. Paul and B. Bandyopadhyay, A novel approach for image compression based on multi-level image thresholding using Shannon entropy and differential evolution, in: IEEE Conference Publications, SIU Maharashtra, pp. 56–61, 2014. doi:10.1109/TechSym.2014.6807914.
[13] C. Saravanan and M. Surender, Enhancing efficiency of Huffman coding using Lempel Ziv coding for image compression, International Journal of Soft Computing and Engineering (ISSN 2231-2307) 2 (2013).
[14] S. Wang and T. Lin, Compound image compression based on unified LZ and hybrid coding, IET Image Process. 7 (2013), 484–499. doi:10.1049/iet-ipr.2012.0439.
[15] J. Wang and N. Zheng, A novel fractal image compression scheme with block classification and sorting based on Pearson's correlation coefficient, IEEE Trans. Image Process. 22 (2013), 3690–3702. doi:10.1109/TIP.2013.2268977.
[16] H. Yang, Y. Fang, Y. Yuan and W. Lin, Subjective quality evaluation of compressed digital compound images, J. Vis. Commun. Image Represent. 26 (2015), 105–114. doi:10.1016/j.jvcir.2014.11.001.
[17] H. Yang, S. Wu, C. Deng and W. Lin, Scale and orientation invariant text segmentation for born-digital compound images, IEEE Trans. Cybern. 45 (2015), 519–533. doi:10.1109/TCYB.2014.2330657.
[18] W. Zhu, O. C. Au, W. Dai, H. Yang, R. Ma, L. Jia, J. Zeng and P. Wan, Palette-based compound image compression in HEVC by exploiting non-local spatial correlation, in: IEEE Conference on Acoustics, Speech and Signal Processing, China, pp. 7348–7352, 2014. doi:10.1109/ICASSP.2014.6855027.
[19] W. Zhu, W. Ding, J. Xu, Y. Shi and B. Yin, Screen content coding based on HEVC framework, IEEE Trans. Multimed. 16 (2014), 1316–1326. doi:10.1109/TMM.2014.2315782.
©2020 Walter de Gruyter GmbH, Berlin/Boston
This work is licensed under the Creative Commons Attribution 4.0 Public License.