

FPGA Based Reconfigurable Hardware Architecture for Quality Access Control of Images and Coupling in Fiber Optics Communication

> Dr. Himadri Mandal Dr. Amit Phadikar

Kripa Drishti Publications, Pune.

# FPGA Based Reconfigurable Hardware Architecture for Quality Access Control of Images and Coupling in Fiber Optics Communication

#### Dr. Himadri Mandal

Associate Professor, Dept. of Electronics & Communication Engineering, Calcutta Institute of Technology, Uluberia Howrah W.B., India.

#### Dr. Amit Phadikar

Professor, Dept. of Information Technology, MCKV Institute of Engineering (Autonomous), Liluah, Howrah, W.B., India.

Kripa-Drishti Publications, Pune.

| Book Title: | FPGA Based Reconfigurable Hardware Architecture<br>for Quality Access Control of Images and Coupling in<br>Fiber Optics Communication |
|-------------|---------------------------------------------------------------------------------------------------------------------------------------|
| Authored by: | Dr. Himadri Mandal, Dr. Amit Phadikar                                                                                                 |

Approved by: Yuan Ze University, Taoyuan, Taiwan, R.O.C, Diss., 2018.

1<sup>st</sup> Edition



Published: September 2021

**Publisher:** 



#### **Kripa-Drishti Publications**

A/ 503, Poorva Height, SNO 148/1A/1/1A, Sus Road, Pashan- 411021, Pune, Maharashtra, India. Mob: +91-8007068686 Email: <u>editor@kdpublications.in</u> Web: <u>https://www.kdpublications.in</u>

# © Copyright KRIPA-DRISHTI PUBLICATIONS

All Rights Reserved. No part of this publication can be stored in any retrieval system or reproduced in any form or by any means without the prior written permission of the publisher. Any person who does any unauthorized act in relation to this publication may be liable to criminal prosecution and civil claims for damages. [The responsibility for the facts stated, conclusions reached, etc., is entirely that of the author. The publisher is not responsible for them, whatsoever.]

# **ACKNOWLEDGEMENT**

In taking our first step in writing this book, we received a great deal of inspiration and encouragement from our near and dear ones. This book is the result of continuous labour and teamwork.

In this regard we are thankful to all our friends and colleagues for their esteemed support and cooperation. We would like to express our sincere thanks and gratitude to Prof. (Dr.) Tien-Lung Chiu (Photonics Engineering Department, Yuan Ze University, Taoyuan, Taiwan), Dr. Goutam Kumar Maity (Physics Department, Pingla Thana Mahavidyalaya, Maligram, Medinipur (W), W.B., India), Prof. (Dr.) Sankar Gangopadhyay (Dept. of Electronics, Brainware University, Barasat, Kolkata-700124, W.B, India) and Prof. (Dr.) Jonathon David White (Dept. of Photonics Engineering, Yuan Ze University, Taiwan).

We also express our gratitude to the authorities of Calcutta Institute of Technology, Banitabla, Uluberia, Howrah and the authorities of MCKV Institute of Engineering, Liluah, Howrah, W.B., India for providing us with the necessary facilities and support.

We also extend our thanks to Kripa-Drishti Publications, Pune for publishing this high-end technical book.

## Dr. Himadri Mandal,

## Dr. Amit Phadikar

# INDEX

- Chapter 1: Introduction and Scope of the Book
  - 1.1 Introduction:
  - 1.2 Overview of Data Hiding in Digital Media
    - 1.2.1 Watermarking Principles:
    - 1.2.2 Watermarking Applications:
    - 1.2.3 Domain of Implementation:
    - 1.2.4 Review of Prior Art:
  - 1.3 Overview of Coupling of Laser Diode to Single Mode Fiber:
    - 1.3.1 Background:
    - 1.3.2 Various Lensing Schemes:
    - 1.3.3 Evaluation of Coupling Efficiency:
    - 1.3.4 Possible Mismatches Involving Microlens on Fiber Tip:
  - 1.4 Outline of the Book:
  - 1.5 Chapter Summary:
- Chapter 2: FPGA based Data Hiding: Preliminaries, Methods, Model, Tools, and Technique
  - 2.1 Introduction:
  - 2.2 Types of Watermarks:
  - 2.3 Properties of Digital Data Hiding for Quality Access Control:
    - 2.3.1 Fidelity:
    - 2.3.2 Robustness:
    - 2.3.3
    - 2.3.4 Security:
    - 2.3.5 Computation Cost and Complexity:
  - 2.4 The Choice of Working Domain:
    - 2.4.1 Spatial Domain:
    - 2.4.2 DCT
    - 2.4.3 DWT-Lifting:
  - 2.5 Embedding Mechanism:
    - 2.5.1 QIM Modulation Technique:
  - 2.6
    - 2.6.1 PSNR and MSE
    - 2.6.2 SSIM and MSSIM
    - 2.6.3 NCC
    - 2.6.4 NLD
  - 2.7
    - 2.7.1 Power Consumption:
    - 2.7.2
    - 2.7.3 Area Power Frequency Tradeoff:
  - 2.8 Various Architectural Design Strategy and Tools:
  - 2.9 FPGA & FPGA Design Flow:
  - 2.10 Chapter Summary:
- Chapter 3: An Efficient Hardware Architecture for Quality Access Control of Compressed Gray Scale Image on FPGA
  - 3.1 Introduction:
  - 3.2 Passive Data-Hiding Based Quality Access Control Algorithm:
    - 3.2.1 Image Encoding
    - 3.2.2 Image Decoding:
  - 3.3 Proposed VLSI Architecture of Passive Data-Hiding Based Quality Access Control Algorithm [199, 200]:
    - 3.3.1 RTL View:
    - 3.3.2 Data Format:
    - 3.3.3 Architecture of Encoder:
      - 3.3.3.1 Image RAM:
      - 3.3.3.2 Pixel Buffer:
      - 3.3.3.3 Quantizer and Encoder:
      - 3.3.3.4 DCT/IDCT:
      - 3.3.3.5 Control Unit of Encoder:
    - 3.3.4 Architecture of Decoder:
      - 3.3.4.1 Decoder Control Unit:
  - 3.4 Performance Evolution:
  - 3.5 Chapter Summary:
- Chapter 4: Efficient Hardware Implementation of Data Hiding Scheme for Quality Access Control of Grayscale Image Based on FPGA
  - 4.1 Introduction:
  - 4.2 Access Control Scheme:
    - 4.2.1 Watermark Encoding:
    - 4.2.2 Watermark Decoding:
  - 4.3 Proposed VLSI Architecture [203]:
    - 4.3.1 Data Format:
    - 4.3.2 Watermark Encoder Architecture:
      - 4.3.2.1 Image RAM Block:
      - 4.3.2.2 Block Buffer:
      - 4.3.2.3 DCT/IDCT Block:
      - 4.3.2.4 Variance Calculation Block:
      - 4.3.2.5 Dither Generation and Watermark Permutation Block:
      - 4.3.2.6 Watermark Embedding Block:
      - 4.3.2.7 Control Unit Block:
    - 4.3.3 Watermark Decoder Architecture:
  - 4.4 Performance Evolution:
  - 4.5 Chapter Summary:
- Chapter 5: FPGA Implementation of Lifting Based Data Hiding Scheme for Efficient Quality Access Control of Images
  - 5.1 Introduction:
  - 5.2 Access Control Scheme:
    - 5.2.1 Watermark Encoding:
  - 5.3 Proposed VLSI Architecture of Access Control Scheme [171]:
    - 5.3.1 Watermark Encoder Datapath:
      - 5.3.1.1 Image RAM:
      - 5.3.1.2 DWT/IDWT Block:
      - 5.3.1.3 Dither Generation and Watermark Permutation Block:
      - 5.3.1.4 Embed Block:
      - 5.3.1.5 Encoder Control Unit Block:
    - 5.3.2 Watermark Decoder Datapath:
      - 5.3.2.1 Watermark Decoding Control Unit Block:
  - 5.4 Performance Evolution:
  - 5.5 Chapter Summary:
- Chapter 6: A Novel QIM Data Hiding Scheme and its Hardware Implementation using FPGA for Quality Access Control of Digital Image
  - 6.1 Introduction:
  - 6.2 Proposed Access Control Scheme [64]:
  - 6.3 The VLSI Architecture of Proposed Access Control Scheme:
    - 6.3.1 Data Format:
    - 6.3.2 Data Path of Access Control Encoder:
    - 6.3.3 Dither (d) and Watermark (w) Generator:
    - 6.3.4 Image RAM:
    - 6.3.5 Random Sequence Generation:
    - 6.3.6 Control Unit:
    - 6.3.7 Data Path of Access Control Decoder:
    - 6.3.8 Pipelined Architecture of Watermark Extraction & Noise Cancelation:
    - 6.3.9 Pipelined Decoder Control Unit:
  - 6.4 Performance Evolution and Discussion:
    - 6.4.1 Software Simulation:
    - 6.4.2 Hardware Realization Using FPGA:
  - 6.5 Chapter Summary:
- Chapter 7: Mismatch Considerations in Laser Diode to Single-Mode Circular Core Triangular Index Fiber Excitation via Upside Down Tapered Hemispherical Microlens on the Fiber Tip
  - 7.1 Introduction:
  - 7.2 Theory:
  - 7.3 Results and Discussions:
  - 7.4 Chapter Summary:
- Chapter 8: Conclusion and Scope of Future Work
  - 8.1 Conclusion:
  - 8.2 The Scope of Future Work:
- References


ISBN: 978-93-90847-75-4

# **Chapter 1**

# Introduction and Scope of the Book

## **1.1 Introduction:**

The familiar proverb "always trust your eyes" is hard to defend in the era of digital multimedia. In the age of Internet technology the saying holds only partially, because widely available and powerful multimedia editing tools can alter content convincingly. The spread of digital information systems has opened up great scope for improvement and, with it, new challenges. Consumers now use digital cameras, camcorders, high-quality scanners, printers, digital voice recorders, multimedia personal digital assistants, and similar devices to create, manipulate, and enjoy multimedia data to the fullest. The worldwide growth of the Internet and wireless networks offers a ubiquitous medium for carrying and exchanging multimedia data. In addition, many organizations have moved into E-Commerce and E-Business to conduct transactions online. E-Business has two main objectives. First, the manufacturers of digital media place a large number of valuable products on their websites for wide publicity and popularity among consumers. Second, they want to exercise quality access control over general users to protect their commercial interests.

The security and fair use of online portals, together with fast delivery of multimedia products to a variety of end users and devices with guaranteed quality of service (QoS), is important but faces many challenges. Access control is likely to become a common feature of future-generation multimedia networks such as video-on-demand and real-time video multicast systems, where billing and royalties would depend on the degree of QoS delivered. The present work focuses on quality access control using data hiding methods [1-17]. Quality access control is a compromise between full denial and complete acceptance: the hidden data conveys secret information that controls access to different relative qualities. The signal distortion caused by data hiding is expected to be reversible by an authorized user, who can then enjoy the full quality. Manipulation of an image is generally guided by the content of the original image. The fact that digital images can be illegally used, tampered with, and copied has become a very serious issue.

These developments have eroded the trust once placed in multimedia signals such as photos, video clips, and audio clips, which were earlier assumed to be accurate. Consequently, protecting digital data from unlawful replication and manipulation, identifying original ownership, and ensuring the safe and fair use of multimedia information have become very serious issues.

Data verification is considered an effective way to maintain the content integrity and security of multimedia information [18-20]. The approach relies on securely embedding a supplementary message within the primary data so that a few essential requirements are met.

The original data can be analog or digital. In this book, however, we confine our discussion to access control applications based on the data hiding principle in digital media. Significant developments in data hiding based access control have recently been reported by both industry and academia [21-26]. Much research attention has been focused on this new field because of its broad range of uses, such as copyright protection, data verification, content integrity, and certification, along with emerging applications such as medical imaging, fingerprinting and data indexing, and quality access control in multimedia applications. These applications do not all share the same requirements or algorithmic design. Moreover, most of the schemes are realized in software and simulated in MATLAB, which makes them complex and time-consuming. Quality access control sometimes demands a real-time hardware implementation to achieve low power consumption, high speed, and greater reliability, and the scheme should also fit into existing consumer electronic devices. Hardware implementation has several advantages over software simulation, such as shorter execution time, lower power consumption, and an optimized area. Moreover, a hardware (H/W) implementation provides reliable performance when deployed in various consumer electronic gadgets [27-30]. There is therefore a need to build hardware for specific data hiding algorithms to suit particular consumer electronics applications.

## **1.2 Overview of Data Hiding in Digital Media:**

This section introduces the concept of digital data hiding along with its technologies and types. With the rapid growth of the World Wide Web, digital media circulated widely over networks has become very popular. There is always a risk, however: vendors who sell products online are sometimes reluctant to share data, such as images, over the Internet because the data can easily be reproduced and copied without proper consent. In addition, transmission errors may occur when digital data is sent over mobile radio channels. Since their commercial interest is tied to profit, vendors always want to secure their ownership rights on the digital network. Hence the urgent need for a new secure technique.

One well-accepted solution is the imperceptible embedding of secondary information in multimedia signals. Although the concept of data hiding has stood the test of time (for more details of the history of information hiding, the reader is referred to [19, 31-36]), digital data hiding has a short history dating from 1993. Alongside the different uses of data embedding in digital multimedia sources, related fields such as steganography, digital watermarking, and data hiding have emerged. Steganography is the art and science of hiding data. It secures the communication of data by hiding it in the original data. The original data is known as the cover, the host, or the carrier. Thus one may have a cover-image and stego-image, cover-audio and stego-audio, cover-video and stego-video, and so on. Ideally, the stego-object is identical to the original data c, appearing as if no other information has been encoded. Watermarking is comparable to steganography in a number of respects. Both seek to embed information inside a cover signal with little to no degradation of the cover-object. An ideal steganographic system would embed a large amount of information with perfect security and no noticeable degradation of the cover object.

#### **1.2.1 Watermarking Principles:**

The image watermarking process has three distinct stages [19, 33, 37, 38]: watermark embedding, distortion/attack, and watermark detection. Figure 1.1 shows the different stages of digital image watermarking.



Figure 1.1: Stages of digital image watermarking.

First, a watermark (W) is embedded in the host image using an embedding algorithm and a secret key. The basic mathematical form of the embedding process is:

$$I'_H = I_H + (\alpha \times W) \tag{1.1}$$

where

 $I_H$  = host image

 $I'_H$  = watermarked image

 $W$  = watermark

 $\alpha$  = scaling factor

The scaling factor ( $\alpha$ ) can be used with an optional public or secret key (k). The watermarked image  $(I'_H)$  is the output of the watermark embedding block [38]. The next stage of digital image watermarking is to transmit the watermarked image  $(I'_H)$  over the network. Watermark extraction may be non-blind or blind. Non-blind extraction needs the original image for comparison before the watermark can be extracted.

Mathematically, the watermark extraction process can be expressed as:

$$W = (I'_H - I_H) / \alpha \tag{1.2}$$
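Equations (1.1) and (1.2) can be illustrated with a minimal Python sketch. The sample values, the bipolar watermark, and the function names `embed` and `extract_non_blind` are assumptions made for illustration only; they are not taken from the schemes developed later in the book.

```python
def embed(host, watermark, alpha):
    """Eq. (1.1): I'_H = I_H + alpha * W, applied sample by sample."""
    return [h + alpha * w for h, w in zip(host, watermark)]


def extract_non_blind(watermarked, host, alpha):
    """Eq. (1.2): W = (I'_H - I_H) / alpha; requires the original host."""
    return [(m - h) / alpha for m, h in zip(watermarked, host)]


host = [120, 64, 200, 33]    # hypothetical pixel values of I_H
watermark = [1, -1, 1, -1]   # hypothetical bipolar watermark W
alpha = 2                    # hypothetical scaling factor

marked = embed(host, watermark, alpha)
recovered = extract_non_blind(marked, host, alpha)
print(marked)     # [122, 62, 202, 31]
print(recovered)  # [1.0, -1.0, 1.0, -1.0]
```

A larger scaling factor makes the watermark easier to recover after distortion but changes the host samples more, which is the usual tension between robustness and fidelity.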

Blind watermarking is the most challenging setting for recovering the watermark, because decoding requires neither the original host image  $(I_H)$  nor the embedded watermark (W). Such systems extract n bits of watermark data from the watermarked data (i.e., the watermarked image) alone.
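One common way to obtain blind extraction is quantization index modulation (QIM), the embedding family that later chapters of this book build on. The sketch below is a generic scalar QIM under assumed parameters (the step size `delta`, the sample values, and the function names are hypothetical) and should not be read as the authors' exact scheme.

```python
def qim_embed(x, bit, delta):
    """Quantize x onto the lattice for `bit` (offset 0 or delta/2)."""
    offset = bit * delta / 2
    return delta * round((x - offset) / delta) + offset


def qim_extract(y, delta):
    """Blind decoding: choose the bit whose lattice lies closest to y."""
    distances = [abs(y - qim_embed(y, b, delta)) for b in (0, 1)]
    return distances.index(min(distances))


bits = [1, 0, 1, 1, 0]                   # hypothetical watermark bits
host = [37.2, 90.0, 12.7, 201.4, 55.5]   # hypothetical host coefficients
delta = 8                                # assumed quantization step size

marked = [qim_embed(x, b, delta) for x, b in zip(host, bits)]
extracted = [qim_extract(y, delta) for y in marked]
print(extracted)  # [1, 0, 1, 1, 0]
```

The decoder uses only the received samples and the step size; neither the host nor the embedded watermark is needed. Decoding stays correct as long as the distortion added to each sample is smaller than `delta / 4`.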

## **1.2.2 Watermarking Applications:**

Digital watermarking is applicable to many types of digital documents [11-14, 21, 22, 39-49]. It is frequently used for image, video, and audio files. Text documents such as scientific papers, medical reports, legal papers, and military information are also suitable for watermarking. There are two main types of watermarking, invisible and visible, and each has different uses depending on the application domain. Common applications of digital watermarking include ownership verification and identification, authentication, fingerprinting, labeling, copy control, broadcast monitoring, and quality-of-service measurement for multimedia signals in the mobile radio domain. Some of these applications are briefly discussed here:

- **a.** *Broadcast Monitoring:* Advertisers wish to be sure that they receive all of the airtime they buy from broadcasters. This calls for an automatic recognition system that can detect identification codes carried in the broadcast. Watermarking is a natural technique for such monitoring: the watermark resides within the content itself and is compatible with the installed base of broadcast equipment. However, embedding the identification code is considerably more complex than the cryptographic method, in which the code is placed in the file header, and it also affects the visual quality of the content. Even so, many companies protect their broadcasts with watermarking techniques.
- **b.** *Content Authentication:* Content authentication validates the integrity of the watermarked object, that is, it certifies that the object has not been tampered with. The term authentication has a wide range of meanings. For example, cultural organizations may want to certify the authenticity of the works they hold, and a watermark can carry both the ownership rights and the expert's opinion. Uses of digital watermarking for authentication include trusted cameras, video surveillance and remote sensing applications, digital insurance claim evidence, journalistic photography, and digital rights management systems.
- **c.** *Tamper Detection:* Digital objects can be tampered with effortlessly using widely available computer tools. Watermarking offers a solution: the validation mark (watermark) is designed so that it does not survive even the slightest modification of the digital work, so its absence reveals tampering.
- **d.** *Ownership Proof and Identification:* A legal owner can recover the watermark from digital content to prove ownership. Textual copyright notices have limitations, as they can be removed easily. A copyright notice printed on the physical document is not carried over when the digital data is copied. A text notice can be placed in an insignificant position of the document to make it unobtrusive, but an inseparable and imperceptible watermark is a better solution than a visible text mark for owner identification. The watermark is employed not only to identify the copyright owner but also to prove ownership of the document.

- **e.** *Fingerprinting:* Fingerprinting is often called transaction tracking: each copy of a digital work is uniquely identified, much as a fingerprint identifies a person. For each legally distributed copy, the watermark can record the recipient. Visible watermarking is suitable for transaction tracking, but invisible watermarking is more useful. In movie making, for instance, each day's footage (the dailies) is circulated to the people concerned with the production. The footage is sometimes not meant to reach the press, so studios place visible text in a corner of the screen, which identifies each copy of the dailies.
- **f.** *Secret Communications:* In some situations, transmitting encrypted messages draws unwanted attention, and the use of cryptographic technology may be forbidden or restricted by law. Steganography, by contrast, does not reveal that a secret communication is taking place and therefore avoids scrutiny of the sender, message, and recipient. With watermarking, secret and sensitive information can be transmitted without alerting potential attackers or eavesdroppers.
- **g.** *Error Concealment:* Transmitting multimedia signals (for example, images over mobile radio channels) may cause packet loss that degrades the visual quality of the received signal. Watermarking can be used to add redundancy, derived from the host data, to the digital media; the decoder uses this redundancy to reduce visual degradation through post-processing.
- **h.** *Access Control:* Watermarking has recently found a potential application in access control of digital media, motivated by the wide use of the Internet for publicity and e-commerce. Vendors and manufacturers have two conflicting objectives. They need to put a large quantity of their valuable works on websites for broad publicity and, at the same time, they wish to restrict full-quality access in order to preserve their commercial profit. This has created a vital need for manufacturers and vendors to develop a quality access control scheme that lets all receivers of the broadcast channel view a low-quality image with little or no commercial value. Watermarking can serve as an access control tool either to refuse full access or to permit partial access to digital content.
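The idea behind quality access control, a deliberately degraded copy that only a key holder can restore, can be sketched in a few lines of Python. This toy example works directly on pixel values with key-seeded pseudo-random offsets; the function names, the key, and the `strength` parameter are assumptions for illustration, and the sketch is not the transform-domain data hiding scheme the book goes on to propose.

```python
import random


def degrade(pixels, key, strength):
    """Add key-seeded pseudo-random offsets (mod 256) to lower visual quality."""
    rng = random.Random(key)
    return [(p + rng.randint(1, strength)) % 256 for p in pixels]


def restore(degraded, key, strength):
    """Regenerate the same offsets from the key and subtract them."""
    rng = random.Random(key)
    return [(p - rng.randint(1, strength)) % 256 for p in degraded]


original = [12, 200, 255, 0, 97]   # hypothetical 8-bit pixel values
key = 2021                         # hypothetical secret key
strength = 40                      # assumed maximum offset per pixel

low_quality = degrade(original, key, strength)          # what every receiver sees
full_quality = restore(low_quality, key, strength)      # what an authorized user recovers
print(full_quality == original)  # True
```

Every receiver can view `low_quality`, but only a user who holds the key can regenerate the offsets and undo the distortion exactly.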

## **1.2.3 Domain of Implementation:**

Digital data hiding and its applications developed out of investigations into the design of the algorithms and their usefulness. The data hiding process (embedding and decoding) and its performance evaluation through various measures are tested in software. Software implementation runs on a [50, 51].

By contrast, a hardware-based implementation is one in which the functionality of an algorithm is realized entirely in custom-designed circuitry.

The benefit of hardware implementation over software implementation is that the hardware consumes less area and less power. Although a software implementation of an algorithm may take far less time to develop, there are several compelling reasons to adopt a hardware implementation.

In particular, a hardware solution is more economical for consumer electronics devices because the added data hiding component takes up only a tiny, dedicated area of silicon. A software implementation, conversely, requires a dedicated processor such as a DSP core that occupies more area, consumes significantly more power, and may still not be fast enough. There is thus a strong need to implement such algorithms in hardware for practical products.

## **1.2.4 Review of Prior Art:**

Data hiding is an essential tool for digital information security, integrity, and confidentiality. Researchers have devoted considerable effort to devising suitable watermarking algorithms for this purpose, and current work concentrates on improving the various factors associated with watermarking. Gonzalez et al. [52] proposed a discrete Fourier transform-rational dither modulation (DFT-RDM) algorithm that is robust against LTI filtering. The authors claimed and validated that the performance gains of their scheme are much larger than those achieved with regular DM. Khan et al. [53] presented a histogram processing based reversible watermarking scheme that uses down sampling to achieve high capacity. Chen et al. [54] proposed a reversible data hiding method that increases payload capacity while maintaining image quality. Sun et al. [55] also worked on a reversible algorithm, based on the 'joint neighbor coding' technique for 'block truncation coding' (BTC)-compressed images. Kim et al. [56] proposed a lossless data hiding method using absolute moment block truncation coding (AMBTC) that has low complexity, high embedding capacity, and good perceptual quality. However, the scheme is not robust against various intentional and non-intentional attacks.

Karri et al. [57] proposed an algorithm that follows a symmetric encryption and decryption technique and also developed concurrent error detection architectures that explore the tradeoffs among area overhead, performance cost, and fault detection. Weng et al. [58] proposed a data-hiding scheme based on histogram shifting in the integer Haar wavelet transform (IHWT) domain. Parah et al. [59] proposed a blind watermarking technique in the DCT domain. Lo et al. [60] proposed a reversible data hiding scheme for compressed images, which embeds the secret data into BTC-compressed images using histogram shifting. Joshi et al. [61] devised a reversible watermarking scheme to protect the database of a fingerprint reformation system and the sensor. Younes Terchi & Saad Bougueze [62] proposed an efficient watermarking scheme in the parametric QIM domain for robustness improvement. Dogan [63] developed a high capacity data hiding scheme based on 'quantization index' and graph neighborhood degree.

None of the schemes reported above addresses access control of the image. The need for access control has received wide attention, and numerous solutions can be found in the literature [6, 64]. Digital image access control can be implemented in various domains (spatial or transform).

In spatial domain access control, the image pixels are modified directly. In the transform domain (Fourier transform (FT), short time Fourier transform (STFT), discrete cosine transform (DCT), continuous wavelet transform (CWT), discrete wavelet transform (DWT), etc.), the transform coefficients are modified instead.


Among all of the transform tools, DCT and DWT are very popular because the common image compression standards JPEG and JPEG-2000 use DCT and DWT, respectively. Spatial domain techniques are simple and easy to implement with high payload [65], but such schemes are normally less robust against various attacks. Transform domain techniques have become more popular because they are more robust against attacks (filtering, compression) and are compatible with a range of image compression standards.

Chang et al. [6] combined data hiding and encryption to perform access control in the spatial domain for scalable media. Yanyan Xu et al. [66] presented a scheme to protect multimedia information by incorporating encryption and digital fingerprinting in the JPEG compressed domain. Their scheme improves security, efficiency, imperceptibility, and collusion resistance. Phadikar et al. [21] proposed a method that serves the dual objectives of error concealment and quality access control of digital images based on data hiding, forward error correction (FEC) code, and cryptography. The scheme shows that the use of an FEC such as a convolutional code improves robustness against bit errors occurring during transmission over a noisy channel. The scheme is simple, easy to implement, and secure, in the sense that only a user having the correct key can extract the hidden data and conceal the damaged regions. Phadikar et al. [64] used a new model of the quantization index modulation (QIM) technique in the spatial domain for quality access control of greyscale images. The authors embed a permuted binary watermark over N mutually orthogonal signal points without complete suppression of self-noise. The watermark is detected from the weighted average of N decision statistics, which gives access to the superior quality image.

Deepa et al. [67] proposed a lapped biorthogonal transform (LBT) based low-complexity zerotree codec (LZC) for image coding to achieve high compression with low computational complexity. Grosbois et al. [1] implemented a compressed DWT domain access control technique to control access to different resolutions and qualities of the image. The scheme was developed to fit the JPEG-2000 codec. The survey of previous works shows the superiority of wavelets, which allow efficient quality control of images and multimedia signals owing to the multi-resolution nature of the wavelet transform. However, this property involves huge computational overhead regardless of the length of the wavelet filter coefficients. On the other hand, the computational cost of a DCT based implementation is claimed to be lower than that of a conventional DWT based implementation. Moreover, more than 80% of digital content (image and video) is still in DCT compressed form, so access control in that domain is an important research objective. Phadikar et al. [11, 13] developed an access control scheme for both colour and greyscale images in the DCT compressed domain. Ferretti et al. [68] also developed a lossless 2-D discrete wavelet transform architecture using a systolic process for image processing operations.

This brief review of previous works shows that the majority of quality access control schemes are very complex and time-consuming, and have been tested and implemented only in MATLAB. It is extremely hard to employ them in a real-time environment. Real-time reconfigurable hardware implementation of a complex algorithm is difficult because of the large resource utilization and power consumption involved. Nevertheless, reconfigurable hardware implementation has numerous benefits over software realization, such as cost-effectiveness, shorter design time, field programmability, fast response time, moderate processing speed, and optimized power consumption. Over the past decade, numerous efforts have been made to develop hardware systems for security and copyright protection of image and video signals [54, 61, 69, 70].

The merits of a hardware implementation are characterized by parameters such as operating speed, power consumption, and compactness of the design. Hardware designs may be synthesized on platforms such as a custom integrated circuit (IC) [69], a tri-media processor board [69], or a field programmable gate array (FPGA) [71-98]. Very-large-scale integration (VLSI) hardware designs of different conventional watermarking schemes are reported in the literature for real-time security and copyright protection of image or video signals.

Maity & Kundu [71] proposed a VLSI architecture for distortion-free covert image-in-image communication on an FPGA board. This architecture permits a data rate of 4.706 Mbit/s at a clock frequency of 80 MHz. Mohanty et al. [73] introduced a high-performance, low-power, real-time hardware implementation for both robust and fragile watermarking techniques. However, the design uses a large amount of FPGA resources, and the serial processing of pixels makes the scheme inefficient. Karthigaikumar and Baskaran [74] developed a hardware architecture for a binary image watermarking method. The scheme is implemented using a Virtex-E series FPGA (xcv50e-8-cs144) and an application-specific integrated circuit (ASIC). A greyscale image of size (256×256) is processed by two-way parallel processing on the FPGA with 788 lookup tables (LUTs), 457 slice registers, and 279 slice flip-flops. The ASIC implementation of this scheme, in 0.35 µm technology, requires only 404.15 mW of power when operating at 100 MHz.

Maity & Kundu [75] proposed a real-time implementation of a watermarking scheme in the 'fast Walsh transform' (FWT) domain using 'spread spectrum' modulation. The design uses parallel processing to improve the operating speed. However, the resource utilization nearly exhausts the available FPGA resources even for an (8×8) image, and such high resource utilization increases the static power consumption, which is undesirable. Maity et al. [76] described a modified 'reversible contrast mapping' (RCM) algorithm implemented on an FPGA. The scheme attains a data rate of 1.0395 Mbps at an operating frequency of 98.76 MHz. Similar work in [77], also using the modified RCM algorithm, achieves a data rate of 1.0493 Mbps at an operating frequency of 95.3 MHz.

Kaddachi et al. [72] developed an encoder system on FPGA for a wireless sensor camera network. The scheme has also been tested in ASIC. The FPGA based implementation needs only 2632 logic cells to attain a pixel rate of 60.77 Mpixels per second at an operating frequency of 60.77 MHz. Mohanty et al. [78] proposed a watermarking algorithm in the DCT domain and, at the same time, designed and implemented reconfigurable hardware for real-time digital rights management on an Altera® Cyclone-II FPGA. The prototype system achieves a maximum throughput of 43 frames/s at a clock speed of 100 MHz. Darji et al. [79] proposed a hardware architecture for a 'quantization' based watermarking scheme in the wavelet domain. The architecture, with and without pipelining, is synthesized using Xilinx's integrated synthesis environment (ISE) for FPGA. The authors also implemented the algorithm in a semi-custom integrated chip using the UMC 0.18 µm technology standard cell library. The pipelined implementation of the watermarking encoder requires $0.027 \text{ mm}^2$ of area and consumes 0.074 mW of power.

Krill et al. [80] adopted a dynamic partial reconfiguration (DPR) design flow for image and signal processing that implements three operations, namely colour space conversion (CSC), two-dimensional biorthogonal discrete wavelet transform (2-D DBWT), and three-dimensional Haar wavelet transform (3-D HWT), as three intellectual property (IP) cores. The FPGA implementation results show that the DPR environment allows efficiently optimized area/speed ratios along with reduced bit stream size and dynamic power consumption. This approach has great potential for designing hardware for image access control.

Das et al. [81] proposed a VLSI architecture to implement a 'difference expansion' (DE) based watermarking scheme for an image block of $(8 \times 8)$. The authors claim that their scheme consumes 150 mW of power on average while operating at 150 MHz and achieves a throughput of 35.284 Mbps. Hazra et al. [82] implemented a reversible watermarking algorithm based on histogram-bin-shifting (HBS) on FPGA. Their watermarking encoder and decoder modules operate at maximum frequencies of 445.330 MHz and 201.824 MHz and consume 1.215 W and 0.104 W of power, respectively. Roy et al. [83] implemented a DCT based digital watermarking system to embed a watermark into a compressed video signal. They improved system performance by employing a pipelined structure and parallelism. The system embeds the watermark in 1.6 ms per frame at an operating clock frequency of 40 MHz and consumes 270 mW of power.
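The throughput and clock figures quoted in this review can be put on a common footing by dividing the clock frequency by the reported rate, which gives the number of clock cycles spent per output bit (or pixel). The following Python sketch does only this arithmetic on the numbers quoted above; the function name `cycles_per_unit` and the dictionary layout are illustrative assumptions, and the result says nothing about image size, word length, or FPGA family, which differ between the cited works.

```python
# Reported (throughput, clock) pairs quoted in the review above.
# Throughput is in Mbit/s (Mpixel/s for [72]); clock is in MHz.
REPORTED = {
    "Maity & Kundu [71]": (4.706, 80.0),
    "Kaddachi et al. [72]": (60.77, 60.77),  # pixel rate, not bit rate
    "Maity et al. [76]": (1.0395, 98.76),
    "Modified RCM [77]": (1.0493, 95.3),
    "Das et al. [81]": (35.284, 150.0),
}


def cycles_per_unit(throughput_mega_per_s, clock_mhz):
    """Clock cycles spent per output bit (or pixel)."""
    return clock_mhz / throughput_mega_per_s


for name, (rate, clock) in REPORTED.items():
    print(f"{name}: {cycles_per_unit(rate, clock):.2f} cycles per unit")
```

Read this way, the design in [71] spends roughly 17 clock cycles per bit, while the encoder in [72] delivers one pixel per clock cycle.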

## **1.3 Overview of Coupling of Laser Diode to Single Mode Fiber:**

FPGA based hardware architecture is being extensively employed in the field of VLSI technology. Alongside it, the optical fiber has emerged as a potential candidate in the field of communication owing to its large bandwidth (~GHz) and low loss (~0.15 dB/km). The analysis of optical circuitry has therefore attracted global interest recently. In this context, one is motivated to study the launch optics involved in laser diode to fiber coupling.

## **1.3.1 Background:**

Optical fibers are presently considered the most promising medium for optical communication. The operating wavelength ranges from $1.3\ \mu\text{m}$ to $1.6\ \mu\text{m}$ because, the fiber material usually being silica, the minimum attenuation loss (~0.2 dB km$^{-1}$) occurs at a wavelength of $1.55\ \mu\text{m}$, while zero material dispersion is obtained at $1.3\ \mu\text{m}$. It deserves mention in this connection that attenuation and dispersion are two important parameters in the guidance of an optical signal through the fiber. Further, the dispersion has three components, namely waveguide dispersion, material dispersion, and composite profile dispersion.

The composite profile dispersion, which is proportional to the derivative of the core-cladding refractive index difference with respect to wavelength, is very small (~0.5 ps km$^{-1}$ nm$^{-1}$) and as such it is negligible for all practical purposes [99]. The waveguide dispersion arises due to the dependence of the propagation constant on wavelength and thus this component is absent in single-mode fiber.

Thus, optical communication through single-mode fiber with an operating wavelength between $1.3\ \mu\text{m}$ and $1.6\ \mu\text{m}$ has emerged as the preferred practice nowadays. A fiber involves different types of losses, namely absorptive loss and transmission losses in the form of bending, microbending, and splice losses. Apart from these, one important source of loss is the coupling loss associated with laser diode to fiber coupling.
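To give a feel for what an attenuation figure in dB per kilometre means, the short Python sketch below converts an attenuation coefficient and a link length into the fraction of launched power that survives. It uses only the standard decibel definition; the 50 km link length is a hypothetical value chosen for illustration, and the function names are assumptions rather than anything defined in this book.

```python
import math


def transmitted_fraction(alpha_db_per_km, length_km):
    """Fraction of launched optical power remaining after length_km of fiber."""
    total_loss_db = alpha_db_per_km * length_km
    return 10 ** (-total_loss_db / 10)


def loss_in_db(power_fraction):
    """Inverse relation: power fraction (0 < fraction <= 1) to loss in dB."""
    return -10 * math.log10(power_fraction)


# Hypothetical 50 km link at the ~0.2 dB/km attenuation quoted for 1.55 um.
fraction = transmitted_fraction(0.2, 50)
print(f"{fraction:.3f} of the launched power remains ({loss_in_db(fraction):.1f} dB loss)")
```

The same `loss_in_db` helper can be applied to any efficiency figure, so that fiber attenuation and source to fiber coupling loss can be added on one decibel budget.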

In the case of butt-coupling, the efficiency is poor, lying only in the range between 10% and 15% [100]. It is relevant to mention in this connection that the small spot size of the beam emitted by the laser diode has to be matched with the large spot size of the fiber. This matching requires a lens placed between the fiber and the laser. Thus, in the field of coupling optics, various lensing schemes, along with fibers of different refractive index profiles, are continuously being added to the literature in the context of optimum launch optics [101-124].

In this respect, it has been reported that microlenses are more efficient in comparison with intermediate lenses. We detail the performance of different lensing schemes in the following subsection.

## **1.3.2 Various Lensing Schemes:**

The coupling between the laser diode source and the single mode fiber (SMF) is required to be highly efficient in the field of launch optics. The literature survey leads us to conclude [109, 117, 122, 124-130] that different kinds of microlenses, as well as tapered microlenses on the fiber tip, increase the excitation efficiency enormously. Lens fabrication techniques are mainly of two kinds: intrinsic monolithic microlens techniques [101, 104, 131-141] and external bulk lens techniques [142] involving discrete lenses such as the graded-index (GRIN) lens and the ball lens [143-148]. The objective is to obtain maximum coupling efficiency by enlarging the smaller spot size of the laser beam (~1 $\mu$m) so that it matches the larger spot size (~5 $\mu$m) of the fiber. This is achieved by the use of a suitable lensing system. Thus, knowledge of both the lens-transformed laser beam spot size and the fiber spot size is essential in the field of launch optics. Basically, matching the laser spot size to the fiber spot size by a suitable lens arrangement allows the fiber to collect more of the laser light.
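The effect of spot size mismatch can be estimated with a textbook overlap formula. The Python sketch below assumes that both the laser beam and the fiber mode are circular Gaussians with flat phase fronts, perfectly aligned on the same axis; under these assumptions the power coupling efficiency is $\eta = \left[2 w_1 w_2 / (w_1^2 + w_2^2)\right]^2$. This is a simplified model introduced here for illustration, not the formalism used later in the book, and the function names are hypothetical.

```python
import math


def gaussian_coupling_efficiency(w_source_um, w_fiber_um):
    """Power coupling between two coaxial Gaussian modes with flat phase fronts."""
    ratio = 2 * w_source_um * w_fiber_um / (w_source_um ** 2 + w_fiber_um ** 2)
    return ratio ** 2


def efficiency_to_loss_db(efficiency):
    """Coupling loss in dB for a given power coupling efficiency."""
    return -10 * math.log10(efficiency)


# Spot sizes of about 1 um (laser) and 5 um (fiber), as quoted in the text.
eta = gaussian_coupling_efficiency(1.0, 5.0)
print(f"unmatched: {eta:.3f} ({efficiency_to_loss_db(eta):.2f} dB)")
print(f"matched:   {gaussian_coupling_efficiency(5.0, 5.0):.3f}")
```

With the approximate spot sizes quoted above, this idealized estimate gives an efficiency of about 15%, of the same order as the butt-coupling range cited from [100], and it rises to unity once the lens has matched the two spot sizes.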

A microlens on the tip of the fiber is ideal for coupling if (i) its aperture is large enough to receive the entire incident radiation, (ii) it transforms the laser spot size so that it equals the fiber spot size, and (iii) it is free of spherical aberration [138, 140].

A hyperbolic microlens on the tip of a single mode step index fiber can satisfy all of these conditions, which is why it can yield a theoretical efficiency of around 100% at a particular focal length of the hyperbolic microlens [138]. By contrast, a hemispherical microlens on the fiber tip provides only moderate coupling efficiency because of its restricted aperture, spot size mismatch, and spherical aberration [140]. Nevertheless, it is widely used in practice because its fabrication on the fiber tip requires only a simple photolithographic technique. The fabrication of a hyperbolic microlens on the fiber tip, on the other hand, can be carried out only by a sophisticated and costly laser micromachining technique.

Another lensing scheme that has established its importance in the field of launch optics is the upside-down tapered lens. Its fabrication on the fiber tip requires the use of a carbon dioxide laser. This lensing scheme has generated tremendous contemporary interest. Thus, in the field of coupling optics, different kinds of upside-down tapered lenses on the fiber tip are being reported continuously from different research laboratories with a view to maximizing coupling efficiency [121-124, 129, 130, 149, 150].

Further, as far as the evaluation of laser diode to fiber coupling via a microlens on the fiber tip is concerned, the formulation of a simple but accurate method for evaluating coupling optics involving different kinds of microlenses on the fiber tip remains an emerging topic of current research. In this context, the mathematical methodology must be user-friendly and, at the same time, accurate.

## **1.3.3 Evaluation of Coupling Efficiency:**

Evaluation of coupling optics for laser diode to fiber coupling via a microlens on the fiber tip was initially based on the phase matching technique [104, 140]. It involved cumbersome numerical integration, which required long computation time. Thus, the formulation of a simple but accurate method for predicting the said coupling optics emerged as a potential topic of research. Accordingly, transformation matrices, popularly known as ABCD matrices, of different types of microlenses on the fiber tip were prescribed and used to predict the concerned coupling optics. It is relevant to mention in this connection that the literature has already been enriched by the formulation of ABCD matrices appropriate for hemispherical, hyperbolic, tapered, and parabolic microlenses on fiber tips, and by their application in predicting the coupling optics in a simple but accurate formalism [104-106, 109, 116, 127, 138-140, 151-153]. The evaluation of coupling optics for each type of microlens is extremely simple, yet the results are found to agree exactly with those obtained by rigorous numerical integration. Still, different types of microlenses on fibers of different refractive index profiles are being proposed with a view to maximizing source to fiber coupling efficiency. The prescription of ABCD matrices for these lenses and their application to the estimation of coupling optics are being worked out and reported in different contributions [119, 149]. In fact, the literature demands cost-effective but efficient new lensing schemes, combined with the formulation of an appropriate ABCD matrix for predicting the concerned coupling optics in an accurate but user-friendly fashion.
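The appeal of the ABCD formalism is that a Gaussian beam is described by a single complex parameter $q$, and each optical element acts on it through $q' = (Aq + B)/(Cq + D)$. The Python sketch below illustrates the mechanics with the two simplest textbook matrices, free-space propagation and an ideal thin lens. These are stand-ins chosen for illustration: they are not the microlens matrices derived in the cited works, and all function names are assumptions.

```python
import math


def q_at_waist(w0, wavelength):
    """Complex beam parameter at a waist: q = i*z_R, z_R = pi*w0^2/wavelength."""
    return 1j * math.pi * w0 ** 2 / wavelength


def free_space(distance):
    """ABCD matrix for propagation over `distance` in a uniform medium."""
    return ((1.0, distance), (0.0, 1.0))


def thin_lens(focal_length):
    """ABCD matrix of an ideal thin lens (illustrative stand-in only)."""
    return ((1.0, 0.0), (-1.0 / focal_length, 1.0))


def cascade(*matrices):
    """Combine elements; the first argument is the first one the beam meets."""
    result = ((1.0, 0.0), (0.0, 1.0))
    for (a, b), (c, d) in matrices:
        (ra, rb), (rc, rd) = result
        # The new element acts after everything accumulated so far.
        result = ((a * ra + b * rc, a * rb + b * rd),
                  (c * ra + d * rc, c * rb + d * rd))
    return result


def transform_q(q, matrix):
    """Apply q' = (A q + B) / (C q + D)."""
    (a, b), (c, d) = matrix
    return (a * q + b) / (c * q + d)


def spot_size(q, wavelength):
    """Recover the spot size w from Im(1/q) = -wavelength / (pi * w^2)."""
    return math.sqrt(-wavelength / (math.pi * (1 / q).imag))


# Hypothetical values in micrometres: 1 um waist at 1.3 um wavelength.
wavelength_um, w0_um = 1.3, 1.0
rayleigh_um = math.pi * w0_um ** 2 / wavelength_um
q_out = transform_q(q_at_waist(w0_um, wavelength_um), free_space(rayleigh_um))
print(f"spot size one Rayleigh range from the waist: {spot_size(q_out, wavelength_um):.4f} um")
```

Because only 2×2 matrix products and one complex division are involved, the transformed spot size is obtained without any numerical integration, which is the practical advantage the text refers to.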

## **1.3.4 Possible Mismatches Involving Microlens on Fiber Tip:**

As far as fabrication of a microlens on the fiber tip is concerned, only transverse and angular mismatches are possible; the question of longitudinal mismatch does not arise. Further, it deserves mention in this connection that designers do not allow transverse and angular mismatches to exceed $2\ \mu\text{m}$ and 2°, respectively. In this context, it is judicious to apply the ABCD matrix formalism for the relevant lensing system in order to estimate the coupling losses in the presence of possible transverse and angular mismatches in launch optics [106, 115, 117, 123, 130, 149, 154, 155]. Such a study assesses the sensitivity of the coupler to these two kinds of misalignment. The results are of immense importance for optimum launch optics.
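The sensitivity to the two kinds of misalignment can be illustrated with the standard closed-form results for two identical Gaussian modes: a transverse offset $d$ reduces the power coupling by $\exp(-d^2/w^2)$, and a small angular tilt $\theta$ reduces it by $\exp[-(\pi n w \theta/\lambda)^2]$. The Python sketch below evaluates both. It assumes that the lens has already matched the spot sizes perfectly, which is a simplification and not the ABCD-based treatment developed in the book; the spot size, wavelength, and refractive index used are hypothetical illustration values.

```python
import math


def transverse_offset_efficiency(offset_um, spot_um):
    """Coupling between identical Gaussian modes displaced sideways by offset_um."""
    return math.exp(-(offset_um / spot_um) ** 2)


def angular_tilt_efficiency(tilt_deg, spot_um, wavelength_um, index=1.0):
    """Coupling between identical Gaussian modes tilted by a small angle."""
    theta = math.radians(tilt_deg)
    return math.exp(-(math.pi * index * spot_um * theta / wavelength_um) ** 2)


# Hypothetical matched spot size of 5 um at a 1.3 um wavelength in air.
for offset in (0.0, 1.0, 2.0):
    eta = transverse_offset_efficiency(offset, 5.0)
    print(f"offset {offset:.1f} um -> efficiency {eta:.3f}")
for tilt in (0.0, 0.5, 1.0):
    eta = angular_tilt_efficiency(tilt, 5.0, 1.3)
    print(f"tilt {tilt:.1f} deg -> efficiency {eta:.3f}")
```

In this model the tolerance to transverse offset improves with a larger spot size while the tolerance to tilt worsens, which is why both misalignments have to be examined together.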

## **1.4 Outline of the Book:**

The book presents the implementation of various data hiding based quality access control hardware for digital images. The algorithmic study, the corresponding hardware design, and the outcome of the investigation are organized into chapters as summarized below:

Chapter 2 discusses the fundamental issues concerned with data hiding for access control of digital images. It elaborates the properties of digital data hiding, the embedding mechanism in various implementation domains, the qualitative measurement of different data hiding properties, and the measurement of different properties used in VLSI design. Lastly, a brief overview of different architectural implementation strategies, along with FPGA based tools and techniques, is presented in this chapter.

Chapter 3, "FPGA based low power hardware for quality access control of compressed grayscale image", describes a novel hardware architecture of a passive data-hiding scheme for quality access control of digital images in the DCT compressed domain. Serial and parallel hardware architectures are implemented to achieve low power consumption and high throughput.

Chapter 4, "Parallel hardware implementation of data hiding scheme for quality access control of grayscale image based on FPGA", explains the VLSI architecture design and implementation of a DCT domain based quality access control scheme. The VLSI architecture is optimized by parallel processing to improve the throughput and is implemented on a field programmable gate array (FPGA).

Chapter 5, "FPGA implementation of lifting based data hiding scheme for efficient quality access control of images", describes the VLSI architecture design and implementation of a data-hiding technique for efficient quality access control of images using lifting based DWT. The significant feature of this architecture is its low power design using advanced VLSI techniques.

Chapter 6, "A novel QIM data hiding scheme and its hardware implementation using FPGA for quality access control of digital image", describes a spatial domain data-hiding scheme for quality access control of digital images using QIM and its hardware implementation. In this approach, an encoded binary message is embedded over N mutually orthogonal signal points using the QIM technique but without complete self-noise suppression. At the decoder side, watermark bits are extracted using minimum distance decoding. Self-noise is then suppressed by the authorized user to obtain a better quality image. Moreover, serial and pipelined hardware architectures are implemented to minimize the power consumption and at the same time improve the throughput.
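For readers unfamiliar with QIM, the idea of embedding by quantization and extracting by minimum distance decoding can be sketched with the basic scalar form of the technique. In the Python sketch below, each bit value owns its own quantizer lattice (multiples of a step size, shifted by half a step for bit 1), and the decoder picks the bit whose lattice lies closest to the received value. This is plain scalar QIM shown for illustration only; it is not the N-point orthogonal scheme with partial self-noise suppression described in Chapter 6, and the function names and step size are assumptions.

```python
def qim_embed(value, bit, step):
    """Move `value` to the nearest point of the lattice assigned to `bit`."""
    offset = bit * step / 2  # lattice for bit 1 is shifted by half a step
    return round((value - offset) / step) * step + offset


def qim_decode(received, step):
    """Minimum distance decoding: choose the bit whose lattice is closest."""
    distance_0 = abs(received - qim_embed(received, 0, step))
    distance_1 = abs(received - qim_embed(received, 1, step))
    return 0 if distance_0 <= distance_1 else 1


# Hypothetical pixel value and step size.
pixel, step = 137, 8
for bit in (0, 1):
    marked = qim_embed(pixel, bit, step)
    print(f"bit {bit}: {pixel} -> {marked}, decoded as {qim_decode(marked + 1.5, step)}")
```

The step size sets the trade-off: a larger step tolerates more channel noise (errors begin once the disturbance reaches a quarter of the step) but moves the host value further, by up to half a step.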

Chapter 7, "Mismatch considerations in laser diode to single-mode circular core triangular index fiber excitation via upside down tapered hemispherical microlens on the fiber tip", employs the relevant ABCD matrix for an upside-down tapered hemispherical microlens on the tip of a triangular index fiber to study the coupling efficiency of a laser diode to a single-mode circular core triangular index fiber via such a microlens, in the presence of possible transverse and angular offsets.

Chapter 8, "Conclusion and scope of future work", summarizes the contribution of this book along with the future scope of the work.


## **1.5 Chapter Summary:**

In this chapter, we presented the basic principles of data hiding and its applications in digital multimedia, along with a brief literature review. Moreover, we discussed the coupling of a laser diode to a single mode fiber. The last section of the chapter provided an outline of the book. The next chapter provides more information regarding the preliminaries, methods, model, tools, and techniques.

https://www.kdpublications.in

ISBN: 978-93-90847-75-4

# Chapter 2

# FPGA based Data Hiding: Preliminaries, Methods, Model, Tools, and Technique

## **2.1 Introduction:**

The previous chapter gave an overview of data hiding in digital media along with its various applications, a review of prior art, and the scope of the book. Beyond the basic principles of digital data hiding, the implementation process has several prerequisites that must be met to serve a specific design goal.

This chapter discusses the preliminaries, methods, model, tools, and techniques of FPGA based data hiding. The chapter is organized as follows. Section 2.2 discusses the types of watermarks, and Section 2.3 discusses the properties of digital data hiding for quality access control. The choice of working domain is discussed in Section 2.4. Sections 2.5, 2.6, and 2.7 explain the embedding mechanism, the measurement of different watermarking properties, and the measurement of different properties in VLSI design, respectively. Various architectural design strategies for hardware implementation are introduced in Section 2.8. Section 2.9 briefly describes the FPGA and the FPGA design flow. Finally, Section 2.10 concludes with the chapter summary.

## 2.2 Types of Watermarks:

Watermarks and watermarking techniques are broadly categorized in various ways. Depending on the host data, watermarking is classified as text-document watermarking, image watermarking, audio watermarking, or video watermarking.

According to human perception, watermarking is of two types, namely invisible and visible watermarking. Invisible watermarking, in turn, may be robust or fragile. Digital watermarking can also be classified as source based or destination based.

According to the working domain, the watermarking process is subdivided into the spatial domain and the frequency/transform domain. Spatial domain watermarking is simple and easy to implement, but its robustness is poor, whereas frequency or transform domain watermarking schemes are more robust but very complex to implement. The detailed classification of watermarking [156] is shown in Figure 2.1.




Figure 2.1: Classification of digital watermarking

## 2.3 Properties of Digital Data Hiding for Quality Access Control:

The properties of quality access control through digital watermarking technique [157-160] are characterized by the following requirements:

## 2.3.1 Fidelity:

The most significant property of a watermarking system is fidelity, also known as imperceptibility. It is the perceptual similarity between the original host image and the watermarked image. Introducing a watermark into a host image may degrade or destroy the commercial value of the host. However, it is challenging to maintain the fidelity of the host image while keeping the robustness and the capacity within tolerable limits [161, 162].
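Fidelity is ultimately a perceptual notion, but a simple numerical proxy is often useful during design. The Python sketch below computes the peak signal-to-noise ratio (PSNR) between a host and its watermarked version; PSNR is shown here only as one widely used example, and it is an assumption of this sketch rather than a statement of the measures adopted in this book. Images are passed as flat sequences of 8-bit pixel values, and the function name is hypothetical.

```python
import math


def psnr(original, watermarked, peak=255):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(original) != len(watermarked):
        raise ValueError("images must contain the same number of pixels")
    squared_error = sum((a - b) ** 2 for a, b in zip(original, watermarked))
    mse = squared_error / len(original)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(peak ** 2 / mse)


# Hypothetical 2x2 host whose pixels each change by one grey level.
host = [120, 121, 200, 33]
marked = [121, 120, 201, 32]
print(f"PSNR = {psnr(host, marked):.2f} dB")
```

A higher PSNR means the watermarked image is numerically closer to the host, although two images with the same PSNR can still differ in how visible the distortion is.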

## 2.3.2 Robustness:

The robustness of a watermarking system is its ability to preserve the watermark in the hostile environment created by an intentional attacker using different image processing operations. In general, the embedded watermark can be removed with exact knowledge of the embedding process.

However, an unauthorized user who has only partial knowledge of the embedding process and attempts to destroy the watermark should cause significant degradation of the host image quality before the watermark is lost. Many signal processing operations, such as re-sampling, re-quantization, lossy compression, linear and nonlinear filtering, and geometric distortions, damage the watermarked host and may also damage the watermark.

## 2.3.3 Capacity:

Embedding capacity, or payload, is the number of extra data bits embedded into the host image. The embedding capacity of a host depends on the embedding process designed for the application at hand. The number of extra bits must be large enough to improve the security of the host, but at the same time satisfactory imperceptibility must be maintained [161, 162].

## 2.3.4 Security:

Security of a watermarking system is ensured by incorporating cryptographically secure keys. An authentic user who knows the key can recover the watermark, extract it from the host, and enjoy the superior quality of the decoded host.

## 2.3.5 Computation Cost and Complexity:

The computation cost and complexity of a watermarking system are defined by the amount of time taken by the watermark encoding and decoding processes. As the computational speed of modern computers grows, greater computational difficulty is needed to provide strong security and validity of the watermark.

## 2.4 The Choice of Working Domain:

The watermark can be embedded in the host image directly or in a transform domain. The direct method of embedding raw data into the original host image is known as the spatial domain technique. It is advantageous because it is simple and easy to implement. Compared with spatial domain techniques, frequency domain/transform domain techniques are more commonly used.

In these techniques, the watermark is inserted into the spectral coefficients of the image. It is noted that the characteristics of the human visual system (HVS) are better captured by the spectral coefficients, which favours the use of frequency domain watermarking techniques [163].

There are different frequency domain/transform domain techniques, such as DCT, DFT, DWT, and so on. Among them, DCT and DWT have become more popular owing to their block-based coding nature. Block-based transformation is commonly used in the Joint Photographic Experts Group (JPEG) image compression standard and the Moving Picture Experts Group (MPEG) video compression standard. In this section, we study DCT and DWT with the lifting scheme.

## 2.4.1 Spatial Domain:

The simplest way of watermarking is the spatial domain method, in which the watermark bits are embedded directly into the host pixels. Most spatial domain schemes are not robust against image processing attacks; nevertheless, owing to their low computational cost, they attract system designers who want to implement low-cost, low-power hardware.
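A classic instance of spatial domain embedding is least significant bit (LSB) substitution, in which each watermark bit replaces the lowest bit of one host pixel. The Python sketch below is a minimal illustration under that assumption; LSB substitution is chosen here because it is the simplest well-known example, not because it is the scheme implemented in this book, and the function names are hypothetical.

```python
def embed_lsb(pixels, bits):
    """Replace the least significant bit of the first len(bits) pixels."""
    if len(bits) > len(pixels):
        raise ValueError("not enough host pixels for the payload")
    marked = list(pixels)
    for index, bit in enumerate(bits):
        marked[index] = (marked[index] & ~1) | (bit & 1)
    return marked


def extract_lsb(pixels, count):
    """Read back the first `count` embedded bits."""
    return [pixel & 1 for pixel in pixels[:count]]


# Hypothetical 8-bit host pixels and a 4-bit payload.
host = [120, 121, 200, 33]
payload = [1, 0, 1, 1]
marked = embed_lsb(host, payload)
print(marked, extract_lsb(marked, len(payload)))
```

Each pixel changes by at most one grey level and the operation needs only a bit mask and an OR, which explains the low hardware cost; the same simplicity is why a mild filtering or compression step is enough to destroy the payload.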


## 2.4.2 DCT:

DCT is widely used in digital image watermarking for several reasons: DCT based schemes are strongly robust, the Watson visual model is based on DCT, and, most importantly, watermarking algorithms based on DCT are compatible with the existing international compression standards [164, 165]. The main idea of these methods is to select middle- or low-frequency coefficients in the DCT domain to embed a watermark bit. The two-dimensional DCT of an image $x(m, n)$ of size (M×N) can be expressed as follows:

$$X(k,l) = \frac{2}{\sqrt{MN}} c(k)c(l) \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} x(m,n) \cos\left[\frac{(2m+1)k\pi}{2M}\right] \cos\left[\frac{(2n+1)l\pi}{2N}\right]$$
(2.1)  
$$k = 0,1,2,\dots,M-1; \quad l = 0,1,2,\dots,N-1$$

The IDCT is:

$$x(m,n) = \frac{2}{\sqrt{MN}} \sum_{k=0}^{M-1} \sum_{l=0}^{N-1} c(k) c(l) X(k,l) \cos\left[\frac{(2m+1)k\pi}{2M}\right] \cos\left[\frac{(2n+1)l\pi}{2N}\right]$$
(2.2)  
$$m = 0, 1, 2, \dots, M-1; \quad n = 0, 1, 2, \dots, N-1$$

Where,

$$c(k) = \begin{cases} 1/\sqrt{2}, & k = 0\\ 1, & k = 1, 2, \dots, M - 1 \end{cases}$$
$$c(l) = \begin{cases} 1/\sqrt{2}, & l = 0\\ 1, & l = 1, 2, \dots, N - 1 \end{cases}$$
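As an illustration, Eqs. (2.1) and (2.2) can be evaluated directly in software. The following Python sketch is a minimal, unoptimized reference and not the hardware method discussed in this book. It uses the standard orthonormal DCT-II form, with the second cosine taken over n and l, and the names `dct2` and `idct2` are chosen here for illustration only.

```python
import math

def c(u):
    """Normalization factor c(.) used in Eqs. (2.1) and (2.2)."""
    return 1.0 / math.sqrt(2.0) if u == 0 else 1.0

def _cos(a, u, size):
    """Cosine kernel cos[(2a + 1) u pi / (2 size)]."""
    return math.cos((2 * a + 1) * u * math.pi / (2 * size))

def dct2(x):
    """Direct evaluation of Eq. (2.1) for an M x N block (list of rows)."""
    M, N = len(x), len(x[0])
    scale = 2.0 / math.sqrt(M * N)
    return [[scale * c(k) * c(l) * sum(x[m][n] * _cos(m, k, M) * _cos(n, l, N)
                                       for m in range(M) for n in range(N))
             for l in range(N)] for k in range(M)]

def idct2(X):
    """Direct evaluation of Eq. (2.2)."""
    M, N = len(X), len(X[0])
    scale = 2.0 / math.sqrt(M * N)
    return [[scale * sum(c(k) * c(l) * X[k][l] * _cos(m, k, M) * _cos(n, l, N)
                         for k in range(M) for l in range(N))
             for n in range(N)] for m in range(M)]

# A flat 8 x 8 block concentrates all of its energy in the DC coefficient.
flat = [[100.0] * 8 for _ in range(8)]
flat_coeffs = dct2(flat)

# An arbitrary test block should survive the DCT -> IDCT round trip.
block = [[float((m * 8 + n) % 17) * 10.0 for n in range(8)] for m in range(8)]
restored = idct2(dct2(block))
```

The four nested loops make the cost of the direct form visible: each of the M×N coefficients needs M×N multiply-accumulate operations, which is why reduced-complexity structures are of interest in hardware.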

The hardware implementation of the DCT algorithm is of special interest because the algorithm consumes substantial computational resources: multiplications, additions, and a large amount of memory. Several algorithms have been proposed over the last couple of decades to reduce the number of computations and to achieve a higher data rate. To achieve a higher data rate with minimum power consumption, a distributed-arithmetic-based technique is used to design a multiplierless DCT structure [166].

## 2.4.3 DWT-Lifting:

One of the notable characteristics of the JPEG2000 standard, and the one that gives it resolution scalability [167], is its use of the two-dimensional DWT (2D-DWT) to transform the image into a more compact representation. This is acknowledged as the essential difference between the JPEG and JPEG2000 standards. Because there is no need to split the input image into non-overlapping 2-D blocks, and because its basis functions have variable length, wavelet coding avoids blocking artifacts even at high compression ratios.

Two major approaches exist for implementing the DWT: the convolution-based approach [168] and the lifting-based approach [169, 170]. The convolution approach to the 1-D DWT needs a massive amount of computation along with large storage elements to hold the coefficients.

Such characteristics are not well suited to efficient high-speed, low-power image processing applications. By contrast, the lifting scheme needs fewer computations than the conventional DWT. The lifting scheme breaks up the wavelet filter bank (high-pass and low-pass filters) into a chain of smaller filters, which reduces the computational complexity by half [169].

Therefore, lifting is widely used for the implementation of image processing applications.

| Algorithm 2.1: The lifting scheme for 1-D DWT [171]. |
|------------------------------------------------------|
| **Inputs:** Pixel ($X_i$)                            |

1. Split $X_i$ into $X_{2i}$, i.e. the even samples, and $X_{2i+1}$, i.e. the odd samples.

2. Predict (P) and update (U): Stage 1

$$P_i^1 = X_{2i+1} + a(X_{2i} + X_{2i+2})$$

$$U_i^1 = X_{2i} + b(P_i^1 + P_{i-1}^1)$$

Predict (P) and update (U): Stage 2

$$P_i^2 = P_i^1 + c(U_i^1 + U_{i+1}^1)$$

$$U_i^2 = U_i^1 + d(P_i^2 + P_{i-1}^2)$$

3. Perform scaling to obtain the approximation and detail DWT coefficients by:

$$P_i = \frac{1}{K} \times P_i^2$$

$$U_i = K \times U_i^2$$

The symbols a=1.58613, b=-0.0529, c=0.882911, d=0.44350 and K=-1.1496 are lifting coefficients with constant values.

**Outputs:** Low-pass coefficient ($X_l$) and high-pass coefficient ($X_h$).

The fundamental lifting concept is to compute the trivial wavelet by splitting the original 1-D host signal into odd and even samples. The values are then modified by alternating prediction and update steps. The lifting algorithm is described in Algorithm 2.1.
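The split, predict, update, and scale steps of Algorithm 2.1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the hardware design of this book. The constants are the commonly tabulated CDF 9/7 lifting values (a and b negative, K positive); this choice is an assumption on our part, and the signs of a and K differ from the values printed with Algorithm 2.1. Mirror extension at the signal borders and an even-length input are also assumptions, and the function names are hypothetical.

```python
# Assumed CDF 9/7 lifting constants (commonly tabulated values).
A, B, C, D, K = -1.586134342, -0.05298011854, 0.8829110762, 0.4435068522, 1.149604398

def _next(seq, i):
    """seq[i + 1], mirrored at the right border."""
    return seq[i + 1] if i + 1 < len(seq) else seq[-1]

def _prev(seq, i):
    """seq[i - 1], mirrored at the left border."""
    return seq[i - 1] if i > 0 else seq[0]

def lift_forward(x):
    """One level of the 1-D lifting DWT for an even-length list x."""
    even, odd = x[0::2], x[1::2]                                  # step 1: split
    h = len(odd)
    p1 = [odd[i] + A * (even[i] + _next(even, i)) for i in range(h)]   # predict 1
    u1 = [even[i] + B * (p1[i] + _prev(p1, i)) for i in range(h)]      # update 1
    p2 = [p1[i] + C * (u1[i] + _next(u1, i)) for i in range(h)]        # predict 2
    u2 = [u1[i] + D * (p2[i] + _prev(p2, i)) for i in range(h)]        # update 2
    low = [K * u for u in u2]                                     # step 3: scaling
    high = [p / K for p in p2]
    return low, high

def lift_inverse(low, high):
    """Undo lift_forward by running the steps backwards with opposite signs."""
    h = len(low)
    u2 = [v / K for v in low]
    p2 = [v * K for v in high]
    u1 = [u2[i] - D * (p2[i] + _prev(p2, i)) for i in range(h)]
    p1 = [p2[i] - C * (u1[i] + _next(u1, i)) for i in range(h)]
    even = [u1[i] - B * (p1[i] + _prev(p1, i)) for i in range(h)]
    odd = [p1[i] - A * (even[i] + _next(even, i)) for i in range(h)]
    x = [0.0] * (2 * h)
    x[0::2], x[1::2] = even, odd                                  # merge samples
    return x

signal = [10.0, 12.0, 14.0, 13.0, 9.0, 7.0, 8.0, 11.0]
low, high = lift_forward(signal)
rebuilt = lift_inverse(low, high)
```

Each lifting step changes one half of the samples using only the other half, so the inverse is obtained by reversing the order of the steps and flipping the signs. This in-place property is what makes the scheme attractive for low-memory hardware.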



Figure 2.2: Two-level decomposition algorithm for 2-D DWT

The 2-D DWT (lifting scheme) can then be computed easily by applying the 1-D wavelet transform along the rows and then along the columns. Figure 2.2 illustrates the 2-D lifting scheme.

#### 2.5 Embedding Mechanism:

Various watermarking mechanisms exist to embed the watermark in the spatial or frequency domain. Among them, least significant bit (LSB), spread spectrum (SS) modulation, pixel value difference, reversible contrast mapping (RCM), singular value decomposition (SVD), and quantization index modulation (QIM) are widely used. In this section, we study the QIM technique.

#### 2.5.1 QIM Modulation Technique:

The QIM method was first introduced in [172]. QIM is a class of nonlinear methods that offers good rate-distortion and robustness performance.

In the QIM technique, the host image is quantized with one of two or more quantizers, chosen according to the watermark. Each quantizer has its own index.

The mathematical model of QIM scheme can be expressed by the following equation:

$$H(x,w) = Q_w(x) \tag{2.3}$$

Where $Q_w(x)$ is the quantization function for watermark w. A pictorial representation of a 2-level quantizer is shown in Figure 2.3. The figure shows that each sample of the host image is quantized to one of the star or circled (with cross) values, which represent binary watermark 1 or 0 respectively.



Figure 2.3: QIM Watermarking

However, this basic form of watermarking is not appropriate, because an unauthorized user who merely knows the quantization pattern can perform the extraction. The most commonly used QIM scheme is the dither modulation technique.

#### a. Watermark Embedding using QIM Process:

In the embedding process, the original signal (x) of N samples is divided so that every L samples carry one watermark bit. Two dither vectors, d(k, 0) and d(k, 1), are defined as follows:

$$d(k,1) = \begin{cases} d(k,0) + \frac{\Delta}{2}, & d(k,0) < 0\\ d(k,0) - \frac{\Delta}{2}, & d(k,0) \geq 0 \end{cases}$$

$$k = 1,2,3,\dots, N/L; \qquad \Delta = \text{quantization step size}$$
(2.4)

d(k, 0) can be selected as a pseudo-random noise sequence with uniform distribution in $[(-\Delta/2), (\Delta/2)]$. The length of the dither vectors is N/L, which is equal to the number of watermark bits. d(k, 0) and d(k, 1) are used to embed zero and one bits, respectively, in the host signal. The original signal is quantized using the above-mentioned two vectors and the following equation:

$$H(x; w_k) = Q_{\Delta}[x + d(k, w_k)] - d(k, w_k)$$
(2.5)

Where $Q_{\Delta}(.)$ is the quantization function with a step size of $\Delta$, which is defined as follows:

$$Q_{\Delta}(x) = \Delta \times round(x/\Delta) \tag{2.6}$$

Here the function round(.) rounds its argument to the nearest integer value.
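The embedding rule of Eqs. (2.4) to (2.6) can be sketched in Python for the case L = 1 (one bit per sample). This is a minimal illustration, assuming that d(k, 0) is the pseudo-randomly drawn vector and that the random seed plays the role of a shared secret; the function names, the sample values, and the step size are hypothetical.

```python
import math
import random

def quantize(x, delta):
    """Eq. (2.6): uniform quantizer with step size delta (ties round upwards)."""
    return delta * math.floor(x / delta + 0.5)

def make_dithers(num_bits, delta, seed=1):
    """Draw d(k,0) uniformly in [-delta/2, delta/2] and derive d(k,1) by Eq. (2.4)."""
    rng = random.Random(seed)  # assumption: the seed acts as the secret key
    d0 = [rng.uniform(-delta / 2, delta / 2) for _ in range(num_bits)]
    d1 = [d + delta / 2 if d < 0 else d - delta / 2 for d in d0]
    return d0, d1

def qim_embed(host, bits, delta, d0, d1):
    """Eq. (2.5) with L = 1: each host sample carries one watermark bit."""
    marked = []
    for k, (x, w) in enumerate(zip(host, bits)):
        d = d1[k] if w else d0[k]          # pick the dither that matches the bit
        marked.append(quantize(x + d, delta) - d)
    return marked

host = [52.0, 61.3, 47.8, 90.1]            # hypothetical host samples
bits = [1, 0, 0, 1]
delta = 8.0
d0, d1 = make_dithers(len(bits), delta)
marked = qim_embed(host, bits, delta, d0, d1)
```

Because the quantizer moves a sample by at most half a step, the embedding distortion per sample is bounded by Δ/2; a larger Δ therefore trades visual quality for robustness.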


#### **b.** Watermark Detection:

The QIM decoder performs minimum distance decoding to estimate the embedded watermark. In the decoder, two vectors $H_0(k)$ and $H_1(k)$ are obtained by embedding zero and one, respectively, in the received vector $\widehat{H}(k)$, and their Euclidean distances from $\widehat{H}(k)$ are calculated. The bit that yields the minimum distance is taken as the embedded bit. When each bit of the watermark is embedded in one sample of the host signal (L=1), the detection process is described by the following equation:

$$\widehat{w}_{k} = \arg\min_{i \in \{0,1\}} \left[\widehat{H}(k) - H_{i}(k)\right]^{2}$$
(2.7)

When the embedding process inserts each bit of the message into L samples of the host image, the detection process is based on the sum over the L samples of the vectors $H_0$, $H_1$ and $\widehat{H}$; i.e.,

$$\widehat{w}_{k} = \arg\min_{i \in \{0,1\}} \sum_{n=(k-1)L+1}^{kL} [\widehat{H}(n) - H_{i}(n)]^{2}$$

$$k = 1, 2, 3, \dots, N/L$$
(2.8)

Where $\widehat{w}_k$ denotes the $k^{th}$ extracted bit of the message.
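Minimum distance decoding as in Eq. (2.8) can be sketched as follows. The block is self-contained, so it repeats a small embedding helper; one dither value per bit is reused for all L samples of that bit, which is an assumption consistent with dither vectors of length N/L. The names, the step size, and the noise level are hypothetical.

```python
import math
import random

def quantize(x, delta):
    """Uniform quantizer with step size delta."""
    return delta * math.floor(x / delta + 0.5)

def embed(host, bits, delta, d0, d1, L):
    """Spread each watermark bit over L consecutive host samples."""
    out = []
    for n, x in enumerate(host):
        k = n // L
        d = d1[k] if bits[k] else d0[k]
        out.append(quantize(x + d, delta) - d)
    return out

def detect(received, delta, d0, d1, L):
    """Re-embed 0 and 1 in the received samples and keep the closer hypothesis."""
    bits = []
    for k in range(len(received) // L):
        block = received[k * L:(k + 1) * L]
        dist = [sum((y - (quantize(y + d, delta) - d)) ** 2 for y in block)
                for d in (d0[k], d1[k])]
        bits.append(0 if dist[0] <= dist[1] else 1)
    return bits

rng = random.Random(7)
delta, L = 8.0, 4
bits = [1, 0, 1, 1, 0]
d0 = [rng.uniform(-delta / 2, delta / 2) for _ in bits]
d1 = [d + delta / 2 if d < 0 else d - delta / 2 for d in d0]
host = [rng.uniform(0.0, 255.0) for _ in range(L * len(bits))]
marked = embed(host, bits, delta, d0, d1, L)
# Mild channel noise, kept below delta/4 so that decoding stays error free.
noisy = [m + rng.uniform(-delta / 5, delta / 5) for m in marked]
recovered = detect(noisy, delta, d0, d1, L)
```

The two quantizer lattices are offset by Δ/2, so any perturbation smaller than Δ/4 leaves every sample closer to the lattice of the bit that was embedded.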

#### 2.6 Measurement of Different Watermarking Properties:

This section briefly describes the mathematical functions used to quantify the visual quality, robustness, and security of a watermarked image.

#### 2.6.1 PSNR and MSE:

The image quality measurement is an open research challenge [173-176]. Several methods have been proposed to measure image quality, such as average absolute difference, mean squared error, Laplacian mean square error, and peak signal-to-noise ratio (PSNR) [19, 177, 178].

Among them, the most widely used technique is the PSNR measurement, which quantifies the amount of distortion present in the host image. For an 8-bit grayscale image, PSNR is defined as:

$$PSNR(dB) = 10\log_{10} \frac{(\text{Peak value of input signal})^2}{\text{Mean Square Error}}$$
(2.9)

The peak value of the input signal for an 8-bit grayscale image is 255. The mean square error (MSE) is the average of the squared differences between the original image $(X_j)$ and the watermarked image $(Y_j)$. Mathematically, the MSE is expressed as:

$$\text{Mean Square Error (MSE)} = \frac{1}{N} \sum_{j=1}^{N} (Y_j - X_j)^2$$
(2.10)
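Eqs. (2.9) and (2.10) translate directly into a few lines of Python. The sketch below treats an image as a flat list of pixel values; the function names and the sample pixels are hypothetical.

```python
import math

def mse(original, watermarked):
    """Eq. (2.10): mean of the squared pixel differences."""
    n = len(original)
    return sum((y - x) ** 2 for x, y in zip(original, watermarked)) / n

def psnr(original, watermarked, peak=255.0):
    """Eq. (2.9): PSNR in dB; returns infinity for identical images."""
    error = mse(original, watermarked)
    if error == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / error)

original = [100, 100, 100, 100]
watermarked = [101, 99, 101, 99]      # every pixel is off by one grey level
quality_db = psnr(original, watermarked)
```

With an MSE of 1, the PSNR of an 8-bit image equals 20·log10(255), about 48.13 dB, which is a convenient sanity check for an implementation.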

#### 2.6.2 SSIM and MSSIM:

All visual objects, such as host images or natural scenes, are highly structured. The pixels of a visual object depend strongly on each other, and these dependencies carry the structural information of the visual host.

The structural similarity (SSIM) index analyzes the pixel intensities of local patterns that have been normalized for luminance and contrast. SSIM is an effective measurement technique for different kinds of visual host image [179]. The SSIM index between two image matrices H and M combines three distinct comparison terms and can be expressed as follows:

$$SSIM(H,M) = [l(H,M)]^{\alpha} \times [c(H,M)]^{\beta} \times [s(H,M)]^{\gamma}$$
(2.11)  
$$l(H,M) = \frac{(2\mu_{H}\mu_{M}+C_{1})}{(\mu_{H}^{2}+\mu_{M}^{2}+C_{1})}, \quad c(H,M) = \frac{(2\sigma_{H}\sigma_{M}+C_{2})}{(\sigma_{H}^{2}+\sigma_{M}^{2}+C_{2})}, \quad s(H,M) = \frac{\sigma_{HM}+C_{3}}{\sigma_{H}\sigma_{M}+C_{3}}$$

where l = luminance, c = contrast, s = structure, $\mu$ = local mean intensity, $\sigma$ = local standard deviation of the image, $C_1$, $C_2$ and $C_3$ are constants, and $\sigma_{HM}$ = local covariance coefficient between H and M.

Now, if $\alpha = \beta = \gamma = 1$ and $C_3 = \frac{C_2}{2}$, the above equation simplifies to:

$$SSIM(H,M) = \frac{(2\mu_H\mu_M + C_1)(2\sigma_{HM} + C_2)}{(\mu_H^2 + \mu_M^2 + C_1)(\sigma_H^2 + \sigma_M^2 + C_2)}$$
(2.12)

The overall image quality is evaluated by the mean SSIM (MSSIM) index.
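The simplified SSIM expression can be sketched in Python for a single window. The sketch keeps the variance term σ_H² + σ_M² + C₂ in the second factor of the denominator, in line with the contrast term of Eq. (2.11). The constants C₁ = (0.01·255)² and C₂ = (0.03·255)² are commonly used defaults and are an assumption here, as are the function name and the sample data; a full MSSIM would average this value over many local windows.

```python
def ssim_window(h, m, peak=255.0):
    """Single-window SSIM for two equally sized lists of pixel values."""
    n = len(h)
    mu_h = sum(h) / n
    mu_m = sum(m) / n
    var_h = sum((a - mu_h) ** 2 for a in h) / n          # sigma_H squared
    var_m = sum((b - mu_m) ** 2 for b in m) / n          # sigma_M squared
    cov = sum((a - mu_h) * (b - mu_m) for a, b in zip(h, m)) / n  # sigma_HM
    c1 = (0.01 * peak) ** 2   # assumed stabilizing constants
    c2 = (0.03 * peak) ** 2
    numerator = (2 * mu_h * mu_m + c1) * (2 * cov + c2)
    denominator = (mu_h ** 2 + mu_m ** 2 + c1) * (var_h + var_m + c2)
    return numerator / denominator

host = [52.0, 55.0, 61.0, 59.0, 79.0, 61.0, 76.0, 61.0]
marked = [54.0, 53.0, 64.0, 57.0, 75.0, 66.0, 72.0, 63.0]
score_same = ssim_window(host, host)
score_marked = ssim_window(host, marked)
```

Identical windows give an index of one, and any change in mean, variance, or correlation lowers it.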

#### 2.6.3 NCC:

Normalized cross-correlation is abbreviated as NCC. To ensure the efficiency of a data hiding scheme, the developer needs to measure the distance between the inserted watermark W(H, M) and the extracted watermark W'(H, M).

The quality of the extracted watermark W'(H, M) can be estimated by the following quantitative measure [180]:

$$\text{Normalized Cross-Correlation (NCC)} = \frac{\sum_{H} \sum_{M} W(H,M) W'(H,M)}{\sum_{H} \sum_{M} [W(H,M)]^2}$$
(2.13)

Note that a good-quality extracted watermark should have an NCC value of unity.
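A direct evaluation of Eq. (2.13) for a small binary watermark can be sketched as follows; the function name and the 2×2 watermark are hypothetical.

```python
def ncc(inserted, extracted):
    """Eq. (2.13): correlation of the two watermarks, normalized by the
    energy of the inserted watermark."""
    num = sum(w * e
              for row_w, row_e in zip(inserted, extracted)
              for w, e in zip(row_w, row_e))
    den = sum(w * w for row in inserted for w in row)
    return num / den

inserted = [[1, 0],
            [1, 1]]
damaged = [[1, 0],
           [0, 1]]          # one watermark bit was lost during extraction
perfect_score = ncc(inserted, inserted)
damaged_score = ncc(inserted, damaged)
```

Here a perfect extraction scores exactly one, while losing one of the three set bits lowers the score to two thirds.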


#### 2.6.4 KLD:

Kullback-Leibler distance is abbreviated as KLD. The KLD metric is used to measure the security of the hidden data in a data hiding system [181]. It measures the distance between two probability distribution functions (true and target), i.e. $P_{d0}$ and $P_{d1}$. For two probability densities $P_{d0}$ and $P_{d1}$, the KLD can be expressed as:

$$KLD(P_{d1} \parallel P_{d0}) = \int P_{d1}(x) \log \frac{P_{d1}(x)}{P_{d0}(x)} dx$$
(2.14)
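For histograms, the integral in Eq. (2.14) becomes a sum over bins. The sketch below is a discrete approximation using the natural logarithm; the function name and the two example distributions are hypothetical, and the code assumes that $P_{d0}$ is non-zero wherever $P_{d1}$ is non-zero.

```python
import math

def kld(p_d1, p_d0):
    """Discrete form of Eq. (2.14): sum of p1 * log(p1 / p0) over all bins.
    Bins where p1 is zero contribute nothing to the sum."""
    return sum(p1 * math.log(p1 / p0)
               for p1, p0 in zip(p_d1, p_d0) if p1 > 0)

p_host = [0.25, 0.75]        # hypothetical histogram of the host data
p_marked = [0.5, 0.5]        # hypothetical histogram after data hiding
divergence = kld(p_marked, p_host)
```

A value of zero means the two distributions are indistinguishable, so smaller values indicate better concealment of the hidden data.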

#### 2.7 Measurement of Different Properties in VLSI Design:

The properties of a VLSI circuit can be measured from the standpoint of power consumption, system throughput, and the area required for circuit implementation. This section provides an overview of the techniques for measuring these parameters.

#### **2.7.1 Power Consumption:**

The total power consumption of any integrated circuit or field programmable device has two components: static power consumption ($P_{static}$) and dynamic power consumption ($P_{dynamic}$).

Static power is consumed by the leakage current that flows from the power supply (VDD rail) to the ground (GND rail) through the circuit components. The dynamic power consumed by a VLSI circuit, in contrast, depends on the effective capacitance of the resources, the resource utilization, and the switching activity of the resources [**182-184**]. The dynamic power of a circuit is expressed in terms of the effective capacitance ($C_j$), input clock frequency ($f_j$), and voltage swing ($V_j$) of the j<sup>th</sup> resource, as given in Eq. (2.15).

$$P_{dynamic} = \sum_{j} \left( C_j \times f_j \times V_j^2 \right)$$
(2.15)

The total power consumed in an electronic circuit can be calculated as:

$$P_{total} = P_{static} + P_{dynamic} \tag{2.16}$$
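As a small worked example of Eqs. (2.15) and (2.16), the following Python sketch sums the dynamic power of two resources and adds a static component. All numerical values are hypothetical and chosen only to make the arithmetic easy to follow; they are not measurements from this work.

```python
def dynamic_power(resources):
    """Eq. (2.15): sum of C_j * f_j * V_j^2 over all resources.
    Each resource is a tuple (capacitance in F, frequency in Hz, voltage in V)."""
    return sum(c * f * v ** 2 for c, f, v in resources)

def total_power(static_power, resources):
    """Eq. (2.16): static plus dynamic power, in watts."""
    return static_power + dynamic_power(resources)

# Hypothetical resources: 10 pF at 100 MHz and 1.0 V, 5 pF at 50 MHz and 1.2 V.
resources = [(10e-12, 100e6, 1.0), (5e-12, 50e6, 1.2)]
p_dynamic = dynamic_power(resources)           # 1.0 mW + 0.36 mW
p_total = total_power(0.5e-3, resources)       # plus 0.5 mW of leakage
```

The quadratic dependence on voltage and the linear dependence on clock frequency explain why raising the operating speed of a design tends to raise its dynamic power.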

#### 2.7.2 Throughput of the System:

The throughput of the embedding and extraction unit is calculated as:

$$Rate(R_{\eta}) = \frac{B_T}{N_C}, \tag{2.17}$$

Where $(N_C)$ represents the total execution time for an $(N \times N)$ grayscale image block and $(B_T)$ is the total number of bits in the output bitstream.

## 2.7.3 Area Power Frequency Tradeoff:

The VLSI hardware realization of a data hiding based access control algorithm on an FPGA needs an optimized VLSI architecture that reduces power consumption and minimizes FPGA resource utilization (area).

VLSI architects can adopt various architectural design strategies depending on the goal of the hardware implementation in terms of power consumption, area utilization, and system operating speed. These three parameters are correlated with each other. The throughput of a VLSI system is related to the operating speed.

To improve the throughput of an FPGA based system, VLSI architects may need to sacrifice implementation area [185, 186]. Moreover, increasing the throughput increases the dynamic power consumption [187, 188]. Therefore, a trade-off is necessary to design an optimized system by adjusting the design parameters [189].

#### 2.8 Various Architectural Design Strategy and Tools:

The hardware architecture refers to how a computing system is designed and it emphasizes the organization of basic components and the behavior of the system. There is a fundamental relationship between the hardware architecture and the performance of the designed hardware.

To achieve better performance, it is very important to understand the architectural design strategy of a system [190-192]. Several architectural strategies, namely serial architecture, parallel architecture, and pipelined architecture, are used to improve the performance of a hardware system.

#### A. Serial Architecture:

The serial architecture of a hardware system processes tasks one by one, in a sequential manner. For example, if a system is specified to perform a list of assignments, the assignments are performed one at a time, and all other assignments wait until the current one completes. This model of processing is also known as sequential processing. Figure 2.4 depicts the idea of evaluating a task composed of n assignments serially.



Figure 2.4: Serial processing flow diagram

Although serial processing uses fewer hardware resources, it requires a long processing time.


## **B.** Parallel Architecture:

The parallel hardware architecture can perform multiple tasks at the same instant by using multiple hardware resources, each of which handles a different task.



Figure 2.5: Parallel processing flow diagram

The parallel processing system can save an enormous amount of processing time, which in turn increases the throughput at the cost of a large amount of hardware resources.

Figure 2.5 shows the parallel hardware structure processing tasks 0 to n at the same instant.

## **C. Pipeline Architecture:**

Pipelining is another hardware architecture that can be used to improve the throughput of a hardware system. Pipelining is the process of accumulating instructions from the processor through a pipeline.

It allows instructions to be stored and executed in an orderly process, and is also known as pipeline processing. In this system, the execution of many instructions overlaps. A complete task is subdivided into multiple stages.

These stages are connected with one another to form a pipe-like structure. To complete the task, instructions enter at one end of the pipe and exit at the other end.

The construction and the execution process are depicted in Figure 2.6 (a) and (b) respectively.




Figure 2.6: (a) pipeline structure (b) Pipelined execution process

#### The Basic Construction of a Pipeline Process Is Concerned with:

- i. The pipelined hardware architecture has one input terminal and one output terminal. The path from input to output is partitioned into multiple processing stages (segments) such that the output of one stage is connected to the input of the next stage and each stage performs a specific operation.
- ii. Register buffers (latches) are employed to hold the intermediate output between two stages.
- iii. A common clock controls all the pipeline stages along with the register buffers.
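The construction listed above can be modelled with a short cycle-level simulation in Python. This is an idealized sketch that assumes one clock cycle per stage and no stalls; the function name and the three example stages are hypothetical. In this ideal case, n tasks pass through k stages in k + n - 1 clock cycles, whereas a serial design that reuses one unit for every stage needs n × k cycles.

```python
def run_pipeline(stages, inputs):
    """Simulate a clocked pipeline with one register after each stage.
    Returns the outputs in order and the number of clock cycles used."""
    regs = [None] * len(stages)      # register buffers between the stages
    pending = list(inputs)
    outputs, cycles = [], 0
    while len(outputs) < len(inputs):
        cycles += 1                  # one tick of the common clock
        # Update from the last stage backwards so that every stage reads
        # the value its predecessor produced in the previous cycle.
        for i in range(len(stages) - 1, 0, -1):
            regs[i] = stages[i](regs[i - 1]) if regs[i - 1] is not None else None
        regs[0] = stages[0](pending.pop(0)) if pending else None
        if regs[-1] is not None:
            outputs.append(regs[-1])
    return outputs, cycles

# Three hypothetical stages: add one, double, subtract three.
stages = [lambda v: v + 1, lambda v: v * 2, lambda v: v - 3]
tasks = [1, 2, 3, 4]
results, pipelined_cycles = run_pipeline(stages, tasks)
serial_cycles = len(tasks) * len(stages)
```

After the pipeline has filled, one result leaves on every clock cycle, which is the source of the throughput gain over serial processing.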

#### 2.9 FPGA & FPGA Design Flow:

In this section, we present the basic structure of the field programmable gate array (FPGA) and the FPGA design flow. An FPGA is a reconfigurable integrated circuit that can be programmed at the user end to implement different logic systems. An FPGA is constructed by integrating thousands of logic cells, which are connected via programmable interconnections.

These interconnections are configured to implement VLSI systems. The basic FPGA architecture consists of three main components: an array of configurable logic blocks (CLBs), programmable interconnects, and input/output buffers. FPGA structures have become more complex in order to improve design efficiency.

To support the implementation of more complex signal processing operations, advanced FPGAs include integrated DSP blocks and embedded block RAMs. The modern FPGA architecture is depicted in Figure 2.7.



Figure 2.7: Advanced FPGA architecture

FPGA based system designers implement the hardware design by following the FPGA design flow [193]. The process steps are presented in Figure 2.8.

All the distinct steps of the design flow are governed by an automation tool such as Xilinx ISE or Quartus, provided by Xilinx and Altera respectively [194, 195].

First, a system entity is designed by describing the circuit behavior in a hardware description language (HDL) such as Verilog or VHDL. A synthesis tool is then used to generate the technology-mapped netlist for the system.

The netlist is fitted to the FPGA architecture using a process called place-and-route.

The next design step is to generate a binary programming file, known as the bitstream, and download it into an FPGA device with the help of a programming cable.

Generally, after the device is programmed, circuit verification is carried out to confirm the real and final functionality of the design.



Figure 2.8: Overview of FPGA design flow

## 2.10 Chapter Summary:

In this chapter, we have presented the various properties of digital data hiding, the measurement of those properties, and the working domains of implementation along with the embedding mechanism. We have also given a brief introduction to the FPGA and the FPGA based design flow, and Section 2.8 gave an overview of various architectural design strategies for the VLSI implementation of digital data hiding techniques. This chapter provides the primary background and supporting information required for the subsequent discussion. The following chapters discuss the FPGA based hardware implementation of different data hiding schemes for quality access control of digital images.

https://www.kdpublications.in

ISBN: 978-93-90847-75-4

Chapter 3

# An Efficient Hardware Architecture for Quality Access Control of Compressed Gray Scale Image on FPGA

# **3.1 Introduction:**

The massive technological advancement of the last few decades has increased the use of digital multimedia content, such as documents, images, audio, and video, in everyday life. The growth of the internet and wireless networks worldwide also offers an omnipresent medium to carry and exchange multimedia data. Moreover, many organizations have moved into E-Commerce and E-Business to conduct online transactions. E-Business has two main objectives. First, the manufacturers of digital media place a great number of valuable products on their websites for wide publicity and popularity among consumers; second, they also want to exercise quality access control over general users to protect their commercial interests. The security and fair use of the online portal, as well as the fast delivery of multimedia products to a variety of end users or devices with guaranteed QoS, are important but face many challenging problems. In this respect, a quality access control system is of fundamental importance to the manufacturers of digital media, because it lets them restrict the QoS according to the subscription agreement. Present-day digital media are published in compressed form following compression standards such as JPEG or JPEG 2000.

Generally, the access control mechanism is useful in modern multimedia communication to fulfill QoS requirements in applications such as real-time HD broadcasting, TV broadcasting, on-demand video and music transmission, and online gaming. Accordingly, quality access control has become more popular, and researchers have contributed different solutions in the literature [6, 64, 196, 197]. The literature related to this work has already been described in Section 1.2.4 of Chapter 1.

Most of the related FPGA based hardware solutions found in the literature work well, but from the implementation point of view there is ample scope to improve performance. Most of the previous work also used small images, which are not suitable for a quality access control scheme. Moreover, the reported hardware suffers from high power consumption and low throughput, and requires a large amount of FPGA resources. This demands an improved architectural design strategy to meet the said objectives.

In this work, we first design a serial hardware architecture and then a parallel hardware architecture to improve the performance of a quality access control scheme in the DCT compressed domain. The main design objective is to achieve low power and high throughput with minimum use of FPGA resources.

The system-level implementation of the encoder and decoder modules is done on a Xilinx FPGA (Zynq device family, XC7Z010-CLG400). The corresponding performance analysis of our model is reported to validate the claims.

# The Main Characteristics of the Hardware Design are:

- a. Larger Size Image: The hardware can support images of up to $(512 \times 512)$ pixels.
- b. Minimum FPGA Resource Utilization: The image encoder and decoder modules use fewer field programmable gate array (FPGA) resources, such as occupied slices, slice registers, slice LUTs, BRAMs, and DSP blocks, than other hardware implementations. The major achievement in this context is that the encoding and decoding modules save 84.73% and 23.29% of FPGA resources, respectively, compared with similar implementations found in the literature [76][79].
- c. Power Consumption: With the optimized design, the encoder and decoder architectures require only 65.47 mW and 78.83 mW of power, respectively. The power performance of the scheme is calculated with the Xilinx XPower Analyzer.
- d. Throughput: The throughput of the prototype hardware is 11.37 Megabyte/second (encoder) and 11.37 Megabyte/second (decoder) when operating at clock frequencies of 110.703 MHz and 111.03 MHz respectively.

The rest of the chapter is organized as follows: Section 3.2 describes the passive data hiding based quality access control algorithm. The proposed VLSI architecture of the above passive data hiding algorithm is described in Section 3.3. Section 3.4 presents the simulation results with the discussion. Finally, the chapter summary is presented in Section 3.5.

# 3.2 Passive Data-Hiding Based Quality Access Control Algorithm:

In this section, we describe the passive data-hiding based quality access control algorithm [198] chosen for the VLSI implementation. The scheme has two modules i.e. image encoding and image decoding.

#### **3.2.1 Image Encoding:**

The block diagram of the proposed access control encoding scheme is shown in Figure 3.1, and the encoding process is described in Algorithm 3.1. The encoding process starts by partitioning a host image of size $(n \times n)$ into $(8 \times 8)$ non-overlapping pixel blocks. The block-wise 2D-DCT is then calculated and quantized for each individual block. The quantization follows the standard quantization table (Qmt), and the coefficients are reordered in zigzag order as prescribed in baseline JPEG. A di-bit pattern is generated based on the polarity of the first two AC coefficients.

This bit pattern represents the percentage of coefficients to be modulated in an ($8\times8$) block. This process governs the permissible distortion limit for the control of image quality. In this case, we consider a 3-bit pattern to control the distortion of an image. For finer control, the user may use more bits. A 20-bit key is formed to govern the coefficient modulation process depicted in Figure 3.1. Lastly, the quantized coefficients are modulated block-wise by the following equation:


$$(Z^e) = (-1) \times (Z) \times k, \tag{3.1}$$

Where $(Z^e)$ denotes the modulated coefficients, (Z) the DCT coefficients, and (k) the modulation strength. The modulated coefficients are then transmitted through a transmission channel using a channel coding technique; we use the Huffman coding technique in this design. The key is tagged to the Huffman coded bit stream before transmission through the channel.

| Algorithm 3.1: Encoding method |
|--------------------------------|
| **Inputs:** Image pixel (P<sub>IX</sub>), standard quantization table (Q<sub>mt</sub>), key (20 bit) |
| 1. Partition the host image into non-overlapping blocks of size $(8 \times 8)$. |
| 2. Compute the 2-D DCT for all $(8 \times 8)$ blocks. |
| 3. Quantize the DCT coefficients by Q<sub>mt</sub> and reorder them in zigzag order. |
| 4. Form the 20-bit key. |
| 5. Calculate the modulated coefficient $(Z^e) = (-1) \times (Z) \times k$. |
| 6. Huffman coding and padding of the 20-bit key. |
| 7. Transmission through the channel. |
| \*Z<sup>e</sup> = modulated coefficients; Z = DCT coefficients; k = modulation strength. |
| **Outputs:** Modulated and Huffman coded DCT coefficients. |
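Steps 1 and 3 of Algorithm 3.1 (block partitioning, quantization, and zigzag reordering) can be sketched in Python as follows. The sketch assumes that Qmt is the baseline JPEG luminance quantization table; the helper names are hypothetical, and the DCT itself is taken as given so that the block stays short.

```python
# Assumption: Qmt is the baseline JPEG luminance quantization table.
QMT = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]

def split_blocks(image, size=8):
    """Step 1: cut a square image (list of rows) into size x size blocks."""
    n = len(image)
    return [[row[c:c + size] for row in image[r:r + size]]
            for r in range(0, n, size) for c in range(0, n, size)]

def zigzag_order(size=8):
    """(row, column) visiting order of the zigzag scan used in baseline JPEG."""
    order = []
    for s in range(2 * size - 1):                 # s = row + column
        rows = range(max(0, s - size + 1), min(s, size - 1) + 1)
        if s % 2 == 0:
            rows = reversed(rows)                 # even diagonals run upwards
        order.extend((i, s - i) for i in rows)
    return order

def quantize_and_scan(dct_block, table=QMT):
    """Step 3: divide by the table entries, round, and emit in zigzag order."""
    return [round(dct_block[i][j] / table[i][j]) for i, j in zigzag_order()]

image = [[(r + c) % 256 for c in range(16)] for r in range(16)]
blocks = split_blocks(image)

dct_block = [[0.0] * 8 for _ in range(8)]
dct_block[0][0] = 800.0                           # a block with only a DC term
scanned = quantize_and_scan(dct_block)
```

The zigzag scan places low-frequency coefficients first, so the first two AC coefficients used for the bit pattern are entries 1 and 2 of the scanned list.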



Figure 3.1: Block diagram representation of encoder



Figure 3.2: Structure of 20-bit key

#### **3.2.2 Image Decoding:**

The quality access control decoder is shown in the block diagram of Figure 3.3, and Algorithm 3.2 describes the decoding process. The decoder first extracts the key. The Huffman coded bit streams are then decoded. The decoded coefficients are then demodulated by the following rule:

$$(Z^{el}) = (-1) \times \frac{(Z^{e})}{k}, \tag{3.2}$$

Where,  $(Z^{el})$  represents the demodulated coefficients,  $(Z^{e})$  represents the modulated coefficients and the symbol 'k' represents the modulation strength.
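The modulation rule of Eq. (3.1) and the demodulation rule of Eq. (3.2) are inverses of each other, which the following Python sketch illustrates. The set of modulated positions would be derived from the key in the actual scheme; here it is passed in directly as a hypothetical `selected` set, and the coefficient values are illustrative.

```python
def modulate(coeffs, k, selected):
    """Eq. (3.1): Z^e = (-1) * Z * k, applied only at the selected positions."""
    return [-z * k if i in selected else z for i, z in enumerate(coeffs)]

def demodulate(coeffs, k, selected):
    """Eq. (3.2): Z^el = (-1) * Z^e / k, applied at the same positions."""
    return [-z / k if i in selected else z for i, z in enumerate(coeffs)]

scan = [50, -3, 2, 0, 1, -1, 0, 0]    # hypothetical quantized coefficients
selected = {1, 2, 4}                  # hypothetical positions chosen by the key
strength = 2                          # modulation strength k

degraded = modulate(scan, strength, selected)
restored = demodulate(degraded, strength, selected)
```

A receiver without the key decodes the sign-flipped and scaled coefficients as they are and therefore sees a degraded image, while an authorized receiver recovers the original coefficients.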



Figure 3.3: Decoder block diagram


| Algorithm 3.2: Decoding method |
|--------------------------------|
| **Inputs:** Huffman coded bit stream |
| 1. 20-bit key extraction and Huffman decoding. |
| 2. Calculate the demodulated coefficient $(Z^{el}) = (-1) \times \frac{(Z^{e})}{k}$. |
| 3. De-quantize by zigzag scan. |
| 4. IDCT transform. |
| \*Z<sup>el</sup> = demodulated coefficients; Z<sup>e</sup> = modulated coefficients; k = modulation strength. |
| **Outputs:** Decoded image pixel. |

De-quantization is then performed using the same Qmt that was used during encoding. To restore the host image data, the IDCT is applied to the de-quantized coefficients [198].

# **3.3 Proposed VLSI Architecture of Passive Data-Hiding Based Quality Access Control Algorithm [199, 200]:**

In this section, we describe the proposed VLSI architecture of the passive data-hiding based quality access control algorithm described in Section 3.2. This algorithm is chosen because it is simple and easy to incorporate in the JPEG platform.

The prototype VLSI architectures of the encoder and decoder are described in a hardware description language (HDL) and implemented on an FPGA board. FPGA technology is used for prototyping high-performance digital systems, as it significantly reduces the non-recurring engineering (NRE) cost, risk, and time to market.

Digital system design on an FPGA involves various optimizations of algorithms and transformations. High-end computer-aided design (CAD) tools are used to automate some of the tasks. For the prototype design and synthesis of the proposed access control hardware, the XILINX ISE 14.5 tool is used. The field programmable gate array (FPGA) is a programmable integrated circuit (IC).

An FPGA contains a large array of generic logic cells connected through programmable switches. The logic cells are configured to perform simple functions, and the programmable switches are customized, through hardware description language (HDL) programming, to provide the interconnections among the logic cells. Once the programming and synthesis are completed, a simple adaptor cable is used to download the switch configuration of the desired logic cells into the FPGA device. Thus the custom circuit is obtained.

The access control system (encoder and decoder) is designed in very-high-speed integrated circuit hardware description language (VHDL) and implemented on a XILINX FPGA (Family: Zynq, Device: XC7Z010, Package: CLG400, Speed grade: -3).

#### 3.3.1 RTL View:

A simplified RTL representation of the access control encoding system is shown in Figure 3.4. First, the host image pixels are partitioned into $(8 \times 8)$ pixel blocks and a 2-D DCT is computed for each block. The DCT coefficients are temporarily stored in the "DCT coefficient buffer", which has a capacity of 64 coefficients, each 13 bits long. The design uses a 20-bit register to hold the 20-bit key. After quantization and modulation, the coefficients are transferred to the "Modulated Coefficient RAM".



Figure 3.4: RTL view of access control encoder

#### **3.3.2 Data Format:**

The internal computation of the encoding and decoding system uses the 2's complement representation. The fixed-point data format is [1, 9, 3]: the most significant bit (MSB) is the sign bit, the next 9 bits ($2^{nd}$ to $10^{th}$ from the MSB) represent the integer part, and the remaining 3 least significant bits represent the fractional part, giving a 13-bit word. The range of a general data format [1, *h*, *m*] is given in Eq. 3.3. For the format used here, the range is (-512 to 511.875) and the precision is 0.125.

$$-2^{h} \le H \le \frac{2^{h+m}-1}{2^{m}} \tag{3.3}$$

This data format is selected because the scheme is implemented for greyscale images, which do not require a wide dynamic range. The output stage converts the 2's complement data into its greyscale binary value.
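The [1, *h*, *m*] format can be illustrated with a short Python sketch. This is only a software model of the number format, not the VHDL used in the design; the function names, the round-to-nearest step and the saturation on overflow are assumptions made for the illustration.

```python
def fixed_range(h, m):
    """Return (minimum, maximum, step) of a signed [1, h, m] fixed-point format."""
    step = 1.0 / (1 << m)
    return -float(1 << h), ((1 << (h + m)) - 1) * step, step


def to_fixed(value, h=9, m=3):
    """Encode a real value as a (1 + h + m)-bit 2's complement word.

    Rounding to nearest and saturating on overflow are assumptions.
    """
    width = 1 + h + m
    raw = int(round(value * (1 << m)))
    lowest, highest = -(1 << (h + m)), (1 << (h + m)) - 1
    raw = max(lowest, min(highest, raw))
    return raw & ((1 << width) - 1)


def from_fixed(word, h=9, m=3):
    """Decode a (1 + h + m)-bit 2's complement word back to a real value."""
    width = 1 + h + m
    if word & (1 << (width - 1)):  # sign bit set: negative value
        word -= 1 << width
    return word / (1 << m)


print(fixed_range(9, 3))
print(from_fixed(to_fixed(-1.5)))
```

With *h* = 9 and *m* = 3 the first call reproduces the range and precision quoted above for the 13-bit word.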

#### 3.3.3 Architecture of Encoder:

The architecture of the quality access control encoder is depicted in Figure 3.5. It consists of five distinct modules: Image RAM, Pixel Buffer, Block-based DCT, Quantizer & Encoder, and Control unit. The host image pixels (N×N) are loaded into the Image RAM. The image pixels are then partitioned into (8×8) non-overlapping blocks and stored in the (8×8) Pixel Buffer. This makes the scheme compatible with the JPEG format.

The incoming image pixels are then DCT transformed and modulated with the modulation strength decided by the supplied 20-bit key. Lastly, the modulated DCT coefficients are Huffman coded before transmission through the communication channel.



Figure 3.5: Encoder data path

# 3.3.3.1 Image RAM:

An "Image RAM" is designed to store the image pixels. In this design, the Image RAM stores a (512×512) greyscale image. It has an 18-bit address bus [17:0] and an 8-bit data bus. When the write enable ( $\overline{WE}$ ) input is high, the "Image RAM" places the selected byte onto the data bus.
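The 18-bit address width follows from the image size, since 512 × 512 = 2<sup>18</sup> locations. The Python sketch below illustrates this and lists the addresses of one (8×8) block; the row-major pixel layout and the function names are assumptions, as the text does not state how pixels are ordered in the Image RAM.

```python
WIDTH = 512      # image is 512 x 512 pixels, one byte per pixel
ADDR_BITS = 18   # address bus [17:0]


def pixel_address(row, col):
    """Address of a pixel, assuming a row-major layout in the Image RAM."""
    return row * WIDTH + col


def block_addresses(block_row, block_col, size=8):
    """Addresses of one non-overlapping (8x8) block, in raster order."""
    r0, c0 = block_row * size, block_col * size
    return [pixel_address(r0 + r, c0 + c)
            for r in range(size) for c in range(size)]


# The last pixel of the image uses the highest 18-bit address.
print(pixel_address(511, 511) == (1 << ADDR_BITS) - 1)
print(block_addresses(0, 1)[:8])
```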

# 3.3.3.2 Pixel Buffer:

A conventional RAM is slow because reading and writing cannot be performed simultaneously, which affects the system performance. To speed up the operation and to reduce the logic resource requirement, a dual-port "Pixel Buffer" of 64 locations is designed and integrated.

The dual-port memory has two banks with dual addressing capability. One address bus is used to read 8-bit data from the buffer, while the other provides the address for writing the results to the same buffer.

# 3.3.3.3 Quantizer and Encoder:

The "Quantizer and Encoder" block plays a major role in the encoding process. After the 2D-DCT is computed, the coefficients are stored in the "coefficient buffer". This block reads the coefficients from the "coefficient buffer", quantizes them, and then modulates the quantized coefficients with the user-defined modulation key. The modulation process is elaborated in Algorithm 3.3.

The quantization process is governed by the standard quantization matrix stored in the "quantization Table" RAM. To keep the loss in quality minimal, we use the same quantization table as prescribed by JPEG.

After quantization and modulation, the modified coefficients are transferred to a dual-port RAM named "zigzag matrix RAM".
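The quantization and de-quantization steps can be sketched in Python as follows. The sketch assumes that the JPEG table referred to above is the standard luminance table from Annex K of the JPEG specification and that quantization rounds to the nearest integer; both are assumptions, and the function names are illustrative only.

```python
import math

# Standard JPEG luminance quantization table (assumed to be the table used).
JPEG_LUMA_Q = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]


def round_half_away(x):
    """Round to the nearest integer, ties away from zero (assumed rule)."""
    return int(math.copysign(math.floor(abs(x) + 0.5), x))


def quantize(block, table=JPEG_LUMA_Q):
    """Divide each DCT coefficient by its table entry and round."""
    return [[round_half_away(block[r][c] / table[r][c]) for c in range(8)]
            for r in range(8)]


def dequantize(levels, table=JPEG_LUMA_Q):
    """Multiply each quantized level by its table entry."""
    return [[levels[r][c] * table[r][c] for c in range(8)] for r in range(8)]


block = [[0.0] * 8 for _ in range(8)]
block[0][0], block[0][1], block[1][0] = 160.0, -22.0, 5.0
levels = quantize(block)
print(levels[0][:2], levels[1][0])
```

Small coefficients such as the value 5 at position (1, 0) quantize to zero, which is where the loss in quality comes from.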

| Algorithm 3.3: Coefficient modulation process [] |
|---|
| <b>Input:</b> Quantized DCT coefficient, K = modulation strength. |
| <b>Output:</b> Modulated DCT coefficients. |
| Start |
| Store all the DCT coefficients in RAM |
| Set a counter c1 = 64 |
| for(c1 = 1:64) |
| t = 0 |
| if(DCT coefficient $\neq 0$) |
| t = t + 1 |
| if t > 4 |
| Store $\overline{D_{12}}$ bit of 2<sup>nd</sup> and 9<sup>th</sup> value of RAM in S. // Symbol $\overline{D_{12}}$ represents the MSB of the 2<sup>nd</sup> and 9<sup>th</sup> coefficient |
| If S = "00" |
| Set the counter C = 8 |
| for(C = 1:8) |
| Modulated DCT coefficient = $[(-1) \times \text{Quantized DCT coefficient} \times \text{K}]$ |
| If S = "01" |
| Set the counter C = 16 |
| for(C = 1:16) |
| Modulated DCT coefficient = $[(-1) \times \text{Quantized DCT coefficient} \times \text{K}]$ |
| If S = "10" |
| Set the counter C = 24 |
| for(C = 1:24) |
| Modulated DCT coefficient = $[(-1) \times \text{Quantized DCT coefficient} \times \text{K}]$ |
| If S = "11" |
| Set the counter C = 32 |
| for(C = 1:32) |
| Modulated DCT coefficient = $[(-1) \times \text{Quantized DCT coefficient} \times \text{K}]$ |
| Stop |
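One possible software reading of Algorithm 3.3 is sketched below in Python. It is not the authors' VHDL. Several points are assumptions made for the illustration: `t` is taken to be the number of non-zero coefficients in the block, each bit of S is taken to be the sign bit (1 for a negative value) of the 2<sup>nd</sup> and 9<sup>th</sup> stored coefficient in that order, and the coefficients are processed in the order in which they are stored in RAM.

```python
def sign_bit(value):
    """MSB of a 2's complement word: 1 for negative values, else 0."""
    return 1 if value < 0 else 0


def modulate_block(coeffs, k):
    """Modulate one block of 64 quantized DCT coefficients.

    coeffs: list of 64 integers in the stored order (assumed).
    k: modulation strength.
    """
    if len(coeffs) != 64:
        raise ValueError("expected 64 coefficients")
    nonzero = sum(1 for c in coeffs if c != 0)
    if nonzero <= 4:                      # sparse block: left unchanged
        return list(coeffs)
    # Two-bit selector S from the 2nd and 9th coefficients (assumed bit order).
    s = (sign_bit(coeffs[1]) << 1) | sign_bit(coeffs[8])
    count = 8 * (s + 1)                   # "00"->8, "01"->16, "10"->24, "11"->32
    out = list(coeffs)
    for i in range(count):
        out[i] = -coeffs[i] * k           # (-1) x coefficient x K
    return out


block = [10, -3, 2, 1, 1] + [0] * 25 + [7] + [0] * 33
modulated = modulate_block(block, k=2)
print(modulated[:5], modulated[30])
```

In this example the 2<sup>nd</sup> coefficient is negative and the 9<sup>th</sup> is zero, so S = "10" and the first 24 coefficients are modulated while the coefficient at position 31 is left untouched.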

The speed of the encoder module is improved by the parallel architecture shown in Figure 3.6.

The parallel coefficient modulation unit is simple in construction and is built around a serial-in parallel-out (SIPO) shift register.

Based on the user-defined 20-bit key, the modulation strength (MT) is selected and the multiplications are carried out in parallel.

In this way, the parallel modulator architecture modulates eight quantized coefficients in a single clock pulse to improve the overall speed of the system. For example, modulating 64 coefficients requires 64 clock cycles with the serial architecture, while the parallel architecture requires only 8 clock cycles. In other words, parallel processing saves 87.5% of the clock cycles needed for modulation.
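The cycle count quoted above can be checked with a small Python model. The model only counts one clock per group of eight coefficients and ignores the time needed to fill the SIPO register; the function names and the simplified modulation (every coefficient multiplied by -K) are assumptions for the illustration.

```python
def modulation_cycles(n_coeffs, lanes):
    """Clock cycles needed when `lanes` coefficients are handled per clock."""
    return -(-n_coeffs // lanes)  # ceiling division


def modulate_parallel(coeffs, k, lanes=8):
    """Process the coefficients in groups of `lanes`, one group per clock."""
    out, clocks = [], 0
    for start in range(0, len(coeffs), lanes):
        group = coeffs[start:start + lanes]   # words presented by the SIPO register
        out.extend(-c * k for c in group)     # all lanes multiply in the same clock
        clocks += 1
    return out, clocks


serial = modulation_cycles(64, 1)
parallel = modulation_cycles(64, 8)
saving = 100.0 * (serial - parallel) / serial
print(serial, parallel, saving)
```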



Figure 3.6: Co-efficient modulation unit by parallel processing

# 3.3.3.4 DCT/IDCT:

The VLSI implementation of the DCT/IDCT module was given particular importance in order to reduce complexity and improve system performance. The main characteristics of our DCT/IDCT block are: (a) it is multiplier-less; (b) it can be computed in parallel. The DCT/IDCT is computed with the distributed arithmetic (DA) approach described in [201]. The 1-D DCT is first performed along the rows and then along the columns, following the row-column distributed arithmetic proposed in [201].

The constructional details of the $(8\times8)$ block-based 2-D DCT/IDCT module are shown in Figure 3.7(a), and the pin details are given in Figure 3.7(b). To reduce the clock requirement, resource requirement and power consumption, we optimize the DCT/IDCT module architecture proposed in [166]. When the 'X\_in valid' pin is high, eight image pixels are transferred to the '1D-DCT/IDCT' block to perform the DCT/IDCT. The 1D-DCT is calculated for the first row and the output is forwarded to a serial-in-parallel-out transpose buffer.

After eight consecutive execution cycles, the transpose buffer is filled with the row-wise 1-D DCT coefficients and 'X\_in ready' is set to 1.



| Pin Name    | Pin Type   | Description                                             |
|-------------|------------|---------------------------------------------------------|
| Clk, rst    | Input      | Clock and reset inputs                                  |
| X_in valid  | Input      | Input pixel control signal to each 1-D DCT/IDCT module  |
| X_in ready  | Input      | Initiates second level 1-D DCT/IDCT                     |
| X_out Valid | Output     | Validation signal for output coefficient                |
| x0 to x7    | Input bus  | 8-bit input line                                        |
| y0 to y7    | Output bus | 13-bit DCT/IDCT coefficient output                      |

(b)

#### Figure 3.7: (a) RTL view of 2D-DCT/IDCT module, (b) Port description

'X\_in ready' then initiates the column-wise transform to compute the second level of 1-D DCT. The DCT/IDCT module needs a latency of only 65 clock cycles to process an $(8 \times 8)$ pixel block.
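The row-column decomposition described above can be illustrated with a floating-point Python reference model. This sketch is not the multiplier-less distributed arithmetic implementation of the hardware; it only shows how a 2-D DCT is obtained from two passes of a 1-D DCT with a transpose in between. The orthonormal DCT-II scaling and the function names are assumptions.

```python
import math

N = 8


def dct_1d(x):
    """Orthonormal 8-point DCT-II of one row (floating-point reference)."""
    out = []
    for k in range(N):
        scale = math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
        acc = sum(x[n] * math.cos((2 * n + 1) * k * math.pi / (2 * N))
                  for n in range(N))
        out.append(scale * acc)
    return out


def transpose(m):
    """Swap rows and columns, as the transpose buffer does."""
    return [list(col) for col in zip(*m)]


def dct_2d(block):
    """2-D DCT as row-wise 1-D DCT, transpose, then 1-D DCT again."""
    first_pass = [dct_1d(row) for row in block]               # along the rows
    second_pass = [dct_1d(col) for col in transpose(first_pass)]  # along the columns
    return transpose(second_pass)


flat = [[8.0] * N for _ in range(N)]
coeffs = dct_2d(flat)
print(round(coeffs[0][0], 6))
```

For a flat block all the energy ends up in the DC coefficient, which is a convenient sanity check for either pass.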

#### **3.3.3.5** Control Unit of Encoder:

The control unit provides the timing sequences needed to perform the encoding operation. The finite state machine (FSM) based control unit controls the overall process by providing predetermined timing and control signals. The complete encoding operation is controlled by two FSMs, whose control states are depicted in Figure 3.8(a) and (b). Figure 3.8(b) shows the secondary FSM, which is nested in state *ST3'* of the main FSM shown in Figure 3.8(a). The main FSM performs the encoding operation in three states. The *ST1'* state initiates the process. The *ST2'* state performs the block segmentation and supplies an (8×8) pixel block to *ST3'*.



Figure 3.8: FSM controlled machine: (a) Main control FSM; (b) Nested FSM for ST3'

The state ST3' contains a nested seven-state FSM. The first two states of the encoding process read the segmented (8×8) image block and store it in a block RAM.

In state 3 (ST3), the 2-D DCT is computed and stored in another block memory. State 4 (ST4) performs the quantization of the DCT coefficients.

The modulation key is generated in state 5 (ST5); based on Eq. (3.1), the incoming coefficients are modulated and then stored in the "Zigzag Matrix RAM".

The last two states, S6 and S7, together produce the Huffman code and pad the 20-bit key onto the Huffman coded coefficients before transmission over the communication channel.
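The sequencing of the two FSMs can be sketched as a simple Python trace generator. The transition structure is an assumption: the sketch supposes that the main FSM loops through ST2' and ST3' once per (8×8) block and that the nested FSM visits its seven states once, in order, for each block. The actual transitions are those of Figure 3.8, and the state labels ST6 and ST7 are used here for the last two nested states only for uniformity.

```python
NESTED_STATES = ["ST1", "ST2", "ST3", "ST4", "ST5", "ST6", "ST7"]


def encoder_trace(num_blocks):
    """List the states visited while encoding `num_blocks` blocks (assumed order)."""
    trace = ["ST1'"]                       # main FSM: initiate the process
    for _ in range(num_blocks):
        trace.append("ST2'")               # main FSM: supply one (8x8) block
        trace.extend("ST3'/" + s for s in NESTED_STATES)  # nested FSM inside ST3'
    return trace


print(encoder_trace(1))
```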

# **3.3.4 Architecture of Decoder:**

The decoder performs the inverse of the encoding operation. The decoder data path is shown in Figure 3.9.

The data path of the decoder unit is a replica of the encoding unit except for the IDCT and demodulator blocks.

The decoder data path recovers the 20-bit key and then demodulates the encoded coefficients using Algorithm 3.4.

The Huffman decoder decodes the encoded coefficients and stores them in a "Zigzag Matrix RAM", whereas demodulation and de-quantization are performed in the demodulator block.

The same quantization matrix is used for the de-quantization of the DCT coefficients. The coefficients are then temporarily stored in a block RAM of size ( $8\times8$ ) before the IDCT is performed.

To speed up the decoding process, a symmetrical parallel architecture of the decoder module is implemented.

The parallel processing architecture is shown in Figure 3.10. A similar serial-in parallel-out (SIPO) shift register based architecture is used to implement a parallel data path that improves the overall latency of the system.



Figure 3.9: Decoder data path

| Algorithm 3.4: Coefficient demodulation process |
|---|
| <b>Input:</b> Modulated DCT coefficient, K |
| <b>Output:</b> Demodulated DCT coefficient. |
| Start |
| Store all modulated DCT coefficients in RAM |
| Set a counter c1 = 64 |
| for(c1 = 1:64) |
| t = 0 |
| if(DCT coefficient $\neq 0$) |
| t = t + 1 |
| if t > 4 |
| Store $\overline{D_{12}}$ bit of 2<sup>nd</sup> and 9<sup>th</sup> value of RAM in S. // Symbol $\overline{D_{12}}$ represents the MSB of the 2<sup>nd</sup> and 9<sup>th</sup> coefficient |
| If S = "00" |
| Set the counter C = 8 |
| for(C = 1:8) |
| Demodulated DCT coefficient = $[((-1) \times \text{Modulated DCT coefficient})/\text{K}]$ |
| If S = "01" |
| Set the counter C = 16 |
| for(C = 1:16) |
| Demodulated DCT coefficient = $[((-1) \times \text{Modulated DCT coefficient})/\text{K}]$ |
| If S = "10" |
| Set the counter C = 24 |
| for(C = 1:24) |
| Demodulated DCT coefficient = $[((-1) \times \text{Modulated DCT coefficient})/\text{K}]$ |
| If S = "11" |
| Set the counter C = 32 |
| for(C = 1:32) |
| Demodulated DCT coefficient = $[((-1) \times \text{Modulated DCT coefficient})/\text{K}]$ |
| Stop |





#### **3.3.4.1 Decoder Control Unit:**

The decoder also uses a finite state machine (FSM) to control the decoding operation. The state machine provides the timing sequences that synchronize the operation of the different blocks of the decoder data path. The decoder FSM consists of five states, ST0 to ST4, as shown in Figure 3.11. The decoding operation starts with the "bit\_encoder\_valid"=1 signal. The ST0 state extracts the 20-bit modulation key from the incoming bit sequence. The remaining part of the modulated bit string is transferred to the Zigzag Matrix RAM. The ST1 state decodes the bit-stream based on the recovered key. De-quantization is performed in the ST2 state using the same quantization table that is used in the encoder. The de-quantized DCT coefficients are then sent to the block buffer to be stored for further processing. The ST3 and ST4 states calculate the IDCT to obtain the original image pixels.

**Figure 3.11: State machine for the decoder**

#### **3.4 Performance Evaluation:**

The performance of the designed hardware is presented in this section. The scheme is designed and tested for different benchmark grayscale images of size ( $512 \times 512$ ). The design is simulated with Xilinx tools (ISE Design Suite 14.5) on an Intel(R) Core(TM) i3-4005U CPU at 1.7 GHz with 8 GB RAM and a 64-bit OS. The prototype hardware is implemented and its performance is analyzed on an FPGA device (Xilinx Zynq-XC7Z010-CLG400-3).

This FPGA is selected because it fits the complexity of the proposed system. This section reports the FPGA based hardware realization results in terms of resource utilization, power consumption, and throughput. The hardware simulation results are obtained and verified with the Xilinx ISE simulator on the VHDL platform.

A prototype VLSI circuit of the access control system for grayscale images is implemented and tested in two ways: (a) with a serial modulator and (b) with a parallel modulator. The component requirements of the VLSI circuit, as reported by the HDL simulation, are listed in Table 3.1.

| Module (image size 512×512) | RAMs | Adders/Subtractors | Counters | Registers | Comparators | Multiplexers | FSMs |
|-----------------------------|------|--------------------|----------|-----------|-------------|--------------|------|
| Encoder                     | 78   | 100                | 10       | 877       | 11          | 194          | 2    |
| Decoder                     | 76   | 85                 | NA       | 92        | 3           | 924          | 1    |

Table 3.1: Macro statistics for an image block of size (512×512).

The FPGA device contains a large amount of reconfigurable resources such as look-up tables (LUTs), configurable logic blocks (CLBs), flip-flops, buffers (BUFG), inbuilt block RAMs (BRAMs), multiplexers (MUX) and a programmable switching matrix. These resources are used to reconfigure the hardware as per the HDL program. The efficiency of an architectural design is evaluated by its resource utilization. Table 3.2 summarizes the FPGA resource utilization of the encoder and decoder.

Target board details: Zynq-XC7Z010-CLG400 (-3)

| Image Size (512×512)                   | Number of Occupied Slices | Number of Slice Registers | Number of 4-input LUTs | Bonded IOBs | BRAMs | Number of DSP48E1s | Maximum Frequency (MHz) |
|----------------------------------------|---------------------------|---------------------------|------------------------|-------------|-------|--------------------|-------------------------|
| Encoder                                | 3533                      | 982                       | 690                    | 32          | 1     | 9                  | 113.02                  |
| Encoder (with parallel implementation) | 3565                      | 992                       | 698                    | 32          | 1     | 9                  | 110.703                 |
| Decoder                                | 6123                      | 1011                      | 851                    | 13          | 1     | 9                  | 113.673                 |
| Decoder (with parallel implementation) | 6149                      | 1092                      | 865                    | 13          | 1     | 9                  | 111.03                  |

Table 3.2: Resource utilization summary of the image (512×512) encoder and decoder.

The power consumption of the encoder and decoder modules is calculated with the graphical power analysis tool for Xilinx FPGAs, the Xilinx XPower Analyzer (XPA). The power consumption and the clock requirement are tabulated in Table 3.3. The encoder hardware unit requires only 65.20 mW (serial modulator) and 65.47 mW (parallel modulator) of power. The decoder hardware unit requires 78.47 mW (serial demodulator) and 78.83 mW (parallel demodulator). Fast hardware execution offers a great advantage in real-time applications. The total clock requirement and the throughput of the encoding/decoding system are also computed.

| Sr. No. | Parameter                    | Encoder | Encoder (Parallel implementation) | Decoder | Decoder (Parallel implementation) |
|---------|------------------------------|---------|-----------------------------------|---------|-----------------------------------|
| 1       | Power (mW)                   | 65.20   | 65.47                             | 78.47   | 78.83                             |
| 2       | Hardware execution time (ms) | 23.97   | 23.04                             | 23.83   | 21.70                             |
| 3       | Total clock required         | 2709504 | 2550816                           | 2708888 | 2550200                           |
| 4       | Data rate (Megabytes/second) | 10.93   | 11.376                            | 11      | 11.413                            |

Table 3.3: HDL summary of power consumption, timing report and data rate.

The throughput is calculated by equation (2.17), and the result is presented in the 4th row of Table 3.3. It is evident that the parallel implementation provides better throughput than the serial implementation.

For further performance improvement in real-time applications, the VLSI designer needs to consider full parallel processing or a combination of parallel and pipelined architectures.
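The relation between the encoder rows of Table 3.3 can be cross-checked with a few lines of Python. The sketch assumes that the execution time is the total clock count divided by the maximum clock frequency of Table 3.2, and that the data rate is the image size in bytes (512 × 512, one byte per pixel) divided by the execution time, with one Megabyte taken as 10<sup>6</sup> bytes. These relations are assumptions and may differ in detail from equation (2.17).

```python
IMAGE_BYTES = 512 * 512  # 8-bit greyscale image, one byte per pixel


def execution_time_ms(total_clocks, f_mhz):
    """Execution time in ms for a given clock count and frequency in MHz."""
    return total_clocks / (f_mhz * 1e6) * 1e3


def data_rate_mbytes(n_bytes, time_ms):
    """Data rate in Megabytes/second (1 Megabyte = 1e6 bytes, assumed)."""
    return n_bytes / (time_ms * 1e-3) / 1e6


# Encoder values taken from Tables 3.2 and 3.3.
t_serial = execution_time_ms(2709504, 113.02)
t_parallel = execution_time_ms(2550816, 110.703)
print(round(t_serial, 2), round(data_rate_mbytes(IMAGE_BYTES, t_serial), 2))
print(round(t_parallel, 2), round(data_rate_mbytes(IMAGE_BYTES, t_parallel), 2))
```

Under these assumptions the computed encoder times and data rates agree with the tabulated values to within rounding.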

Table 3.4: Comparison of results in terms of resource utilization and power consumption. A: Schemes, B: Domain and Image Size, C: Operation, D: Occupied Slice, E: Slice Registers, F: Slice 4-input LUTs, G: BRAMs, H: Total power (mW), I: Clk (MHz).

| A                       | B                    | C                                    | D     | E     | F     | G   | H     | I       |
|-------------------------|----------------------|--------------------------------------|-------|-------|-------|-----|-------|---------|
| A. D. Darji et al. [79] | DWT (Haar) (512×512) | Embedding (Pipelined Implementation) | 1729  | 900   | 3153  | 305 | 103   | 87.82   |
| A. D. Darji et al. [79] | DWT (Haar) (512×512) | Embedding (Parallel Implementation)  | 2002  | 1027  | 3453  | 305 | 99    |         |
| Maity et al. [76]       | Spatial (32×32)      | Encoding                             | 9881  | 9347  | 11291 | 3   | 750   | 98.7    |
| Maity et al. [76]       | Spatial (32×32)      | Decoding                             | 14600 | 12531 | 28753 | 3   |       |         |
| Proposed                | DCT (512×512)        | Encoding (serial modulator)          | 3533  | 982   | 690   | 1   | 65.20 | 113.02  |
| Proposed                | DCT (512×512)        | Decoding (serial demodulator)        | 6123  | 1011  | 851   | 1   | 78.47 | 113.673 |
| Proposed                | DCT (512×512)        | Encoding (parallel modulator)        | 3565  | 992   | 698   | 1   | 65.47 | 110.703 |
| Proposed                | DCT (512×512)        | Decoding (parallel demodulator)      | 6149  | 1092  | 865   | 1   | 78.83 | 111.03  |

The performance parameters of our design are found to be superior to those of similar implementations found in the literature, as tabulated in Table 3.4.

The proposed hardware utilizes about 84.73% fewer FPGA resources than [76] and 23.29% fewer resources than [79].

Moreover, a large power saving of 90.42% over [76] and 36.69% over [79] is also achieved.

The throughput of the proposed work is also better than that of other related FPGA based implementations, as shown in Table 3.5.

Table 3.5: Throughput comparison table.

| Scheme                               | Design Type | Throughput        |
|--------------------------------------|-------------|-------------------|
| Proposed scheme (parallel modulator) | FPGA        | 11.403 Megabyte/s |
| Maity and Kundu [71]                 | FPGA        | 4.706 Megabits/s  |
| Maity et al. [76]                    | FPGA        | 1.0395 Megabits/s |

The main advantages of this design are due to the following facts:

(i) The resource utilization is very low owing to the architectural implementation,

(ii) The use of a low power, multiplier-less DCT/IDCT architecture improves the speed-power performance considerably by computing the 2D-DCT/IDCT using only 64 clock pulses,

(iii) The parallel implementation of the modulator and demodulator improves the efficiency of the system.

Figure 3.12 shows the simulation results of the access control system. Figure 3.12(a) shows the image pixels read out from an $(8 \times 8)$ block.

The block-based DCT is then applied to the image pixels. Based on the proposed algorithm, the DCT coefficients are modulated using the 20-bit key.

Figure 3.12(b) shows that the modulated coefficients are Huffman coded and transmitted along with the 20-bit key.

The valid decoded pixel is displayed on the decoder\_pixel line when decoder\_pixel\_valid=1, as shown in Figure 3.12(c).





Figure 3.12: HDL simulation result of image (512×512) encoder: (a) Pixel from block memory; (b) Huffman coded bit stream along with 20-bit modulation key; (c) Decoded pixel

# 3.5 Chapter Summary:

This chapter described the proposed hardware implementation technique for image quality access control in the DCT compressed domain. FPGA based serial and parallel hardware is implemented and its performance is reported. It is found that the serial hardware implementation of the 8-bit grayscale image ( $512 \times 512$ ) encoder (on a Zynq series FPGA) requires only 982 slice registers and 690 4-input LUTs.

The same architecture consumes 65.20 mW of power while operating at a frequency of 113.02 MHz. The serial implementation of the decoder requires 1011 slice registers and 851 4-input LUTs, with a power consumption of 78.47 mW at an operating frequency of 113.673 MHz.

Similarly, the FPGA based prototype hardware synthesis of the parallel encoder (on a Zynq series FPGA) requires only 992 slice registers and 698 4-input LUTs, with a power consumption of 65.47 mW at an operating frequency of 110.703 MHz. The parallel implementation of the decoder requires 1092 slice registers and 865 4-input LUTs, with a power consumption of 78.83 mW while operating at a frequency of 111.03 MHz. The major achievement of the implemented hardware architecture is a very high throughput with very low power consumption.

These characteristics make the proposed architecture suitable for portable image quality access control applications. Although the designed hardware provides a good data rate and low power consumption, there is still ample scope to improve the hardware performance. The next chapter discusses an implementation of quality access control hardware for greyscale images with parallel processing to achieve a higher data rate.


ISBN: 978-93-90847-75-4

# Chapter 4

# Efficient Hardware Implementation of Data Hiding Scheme for Quality Access Control of Grayscale Image Based on FPGA

# 4.1 Introduction:

Because of the huge progress in technology, digital media such as images, text, audio, and video can be accessed, duplicated and stored without any loss of quality. Content producers need to place large amounts of their expensive work on websites for wide advertising.

At the same time, for commercial reasons, they want to restrict users from accessing the full quality of their creative work. As a solution to this problem, a scheme is required that allows all receivers to access an image with low quality or less commercial value. A genuine user, on the other hand, can access a higher quality image depending upon the user subscription agreement. A complete description of access control techniques and their different hardware implementations is given in Chapter 1.

In this chapter, we propose a VLSI implementation of an adaptive QIM based data hiding scheme in the DCT domain for quality access control of images of size $(512 \times 512)$. The DCT domain is selected for two reasons: (a) about 90% of multimedia objects are still in DCT compressed form, and (b) the human visual system (HVS) has been widely studied in this domain.

The main characteristics of the hardware design are summarized as:

- **a.** *Image Size:* The hardware supports images of size up to $(512 \times 512)$.
- **b.** *Serial and Parallel Architecture:* Both serial and parallel architecture modules are implemented.
- **c.** *Minimal Resource Utilization:* The hardware design is optimized by the advanced Xilinx synthesis technology (XST) tool, so that minimal FPGA resources are needed to implement the access control hardware. The encoding and decoding schemes save 90.94% and 77.57% of FPGA resources, respectively, compared with the related work found in the literature [76].
- **d.** *Higher Throughput:* The proposed encoding and decoding scheme provides a high throughput of 1.34 Gigabyte/s while operating at a clock frequency of 131.167 MHz.
- **e.** *Very Low Power Consumption:* The encoding and decoding schemes consume only 78.45 mW and 78.57 mW of power, respectively, which is about 90% lower than [76].

The rest of the chapter is organized as follows. Section 4.2 describes the adaptive dither modulation based access control scheme, and Section 4.3 proposes the VLSI architecture of that scheme. Section 4.4 presents the performance evaluation, and Section 4.5 gives a summary.

# 4.2 Access Control Scheme:

In the present section, we describe the adaptive dither modulation based quality access control algorithm [202] chosen for the hardware implementation.

# 4.2.1 Watermark Encoding:

The encoding scheme hides a binary watermark in the host image using QIM. First, the binary watermark 'w' is permuted with the supplied secret key (K) before being embedded into the host image. The host image is segmented into non-overlapping blocks of size ( $8\times8$ ), and the variance of each block is calculated. Based on the variance, the blocks are grouped into 't' categories, and a step size ( $\Delta$ ) is selected for each category to generate the binary dither for QIM. The 2-D DCT is then performed on each block.

It is well known that the detection error rate is inversely proportional to the standard deviation; therefore, to improve detection quality, the scheme selects a small step size ( $\Delta$ ) for blocks with large variance [202]. The block diagram of the watermarking encoder is shown in Figure 4.1, and the detailed watermarking process is summarized in Algorithm 4.1 [202].

To embed a watermark using QIM, the dither sequences  $d_{t,q}(0)$  and  $d_{t,q}(1)$  are generated pseudo-randomly based on a key. Eqs. (4.1) and (4.2) show the dither sequence generation process. The step size ( $\Delta$ ) for category 't' is denoted by  $\Delta_t$ .

Algorithm 4.1: The watermark encoding scheme.

Inputs: Watermark (w), A secret key (K), Host image pixel (P<sub>x</sub>).

1. Calculate the permuted watermark (w').

2. Partition the host image into non-overlapping  $(8 \times 8)$  blocks.

3. Calculate the block variance  $V^2 = \frac{1}{P_x} \sum_{i=1}^{P_x} (x[i] - \mu)^2$  and, based on the values of  $V^2$ , group the blocks into 't' categories. The symbol  $\mu$  is the average pixel intensity of a block.

4. Select a different step size ( $\Delta$ ) for each category of blocks.

5. Generate the binary dither sequences  $d_{t,q}(0)$  and  $d_{t,q}(1)$  based on the key  $\Re(\text{key})$  and the step size ( $\Delta_t$ ), as described in Eqs. (4.1) and (4.2).

6. Calculate the 2-D DCT of each block.

7. Insert the permuted watermark (w') into each  $(8 \times 8)$  block in the DCT domain using Eq. (4.3).

8. Compute 2D-IDCT of each block.

Outputs: Watermarked image pixel.



Figure 4.1: Block diagram of watermark encoding process

The symbol  $\Re(\text{key})$  generates the pseudo-random number and  $\Delta_t/2$  is the distance between  $d_{t,q}(0)$  and  $d_{t,q}(1)$ . The dither sequence  $d_{t,q}(0)$  is used for embedding watermark bit '0', while  $d_{t,q}(1)$  is used for embedding watermark bit '1'.

$$d_{t,q}(0) = \{\Re(key) \times \Delta_t\} - \Delta_t/2, \quad 0 \le q \le L - 1, \tag{4.1}$$

$$d_{t,q}(1) = \begin{cases} d_{t,q}(0) + \Delta_t/2 & \text{if } d_{t,q}(0) < 0\\ d_{t,q}(0) - \Delta_t/2 & \text{else} \end{cases} \tag{4.2}$$

Then the AC (alternating current) frequency components of each block are modulated using Eq. (4.3). The symbol 'L' is the length of the dither sequences, i.e. the number of AC coefficients in a block.

$$D_{q} = \begin{cases} Q\{x_{q} - k' \times d_{t,q}(0), \Delta_{t}\} + d_{t,q}(0) & \text{if } w'(i,j) = 0, \\ Q\{x_{q} + k' \times d_{t,q}(1), \Delta_{t}\} - d_{t,q}(1) & \text{else,} \end{cases} \tag{4.3}$$

where  $x_q$  denotes the $q$-th frequency component, $Q$ denotes the uniform quantizer, $k'$ is a constant multiplication factor (in our case, $k' = 2$), and  $D_q$  represents the modulated coefficients. The inverse DCT (IDCT) is then performed to obtain the watermarked image.
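The dither generation and coefficient modulation of Eqs. (4.1) to (4.3) can be sketched in a few lines of Python. This is only an illustrative software model, not the hardware data-path: the quantizer is assumed to round to the nearest multiple of the step size, $\Re(\text{key})$ is modelled by a seeded uniform generator in $[0, 1)$, and the function names and coefficient values are hypothetical.

```python
import math
import random

def quantize(x, delta):
    """Uniform quantizer Q(x, delta): nearest multiple of delta (assumed form)."""
    return delta * math.floor(x / delta + 0.5)

def make_dithers(key, delta, length):
    """Dither pair of Eqs. (4.1)-(4.2); R(key) is modelled as a seeded PRNG."""
    rng = random.Random(key)  # assumption: R(key) is uniform in [0, 1)
    d0 = [rng.random() * delta - delta / 2 for _ in range(length)]
    d1 = [d + delta / 2 if d < 0 else d - delta / 2 for d in d0]
    return d0, d1

def embed_bit(coeffs, bit, d0, d1, delta, k=2):
    """Modulate the AC coefficients of one block with one bit, Eq. (4.3)."""
    if bit == 0:
        return [quantize(x - k * a, delta) + a for x, a in zip(coeffs, d0)]
    return [quantize(x + k * b, delta) - b for x, b in zip(coeffs, d1)]

# Made-up AC coefficients of one block (L = 63) and a step size of 12.
delta, length = 12, 63
d0, d1 = make_dithers(key=2021, delta=delta, length=length)
ac = [(-1) ** q * (40.0 - 0.5 * q) for q in range(length)]
marked = embed_bit(ac, 0, d0, d1, delta)
print(marked[:3])
```

After embedding bit '0', every modulated coefficient lies on the lattice $m\Delta_t + d_{t,q}(0)$; after embedding bit '1', it lies on $m\Delta_t - d_{t,q}(1)$.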

#### 4.2.2 Watermark Decoding:

The decoder performs the inverse operation to recover a higher-quality host image. The block diagram of the watermark decoder is shown in Figure 4.2, and the detailed decoding process is summarized in Algorithm 4.2.

In order to extract the watermark bit from a block, the scheme first calculates the distances i.e.  $A_L$  and  $B_L$ , based on the dither sequences,  $d_{t,q}(0)$  and  $d_{t,q}(1)$  as described in Eqs.(4.4) & (4.5).

A large distance value indicates a lower probability of detection error for the watermark bit.

$$A_{L} = \sum_{q=0}^{L-1} \left( \left| Q \left( y_{q} - d_{t,q}(0), \Delta_{t} \right) + d_{t,q}(0) - y_{q} \right| \right), \tag{4.4}$$

$$B_L = \sum_{q=0}^{L-1} \left( \left| Q \left( y_q + d_{t,q}(1), \Delta_t \right) - d_{t,q}(1) - y_q \right| \right), \tag{4.5}$$

The symbol  $y_q$  represents the *q*-th DCT coefficient of the received signal, which may be distorted.

The extracted watermark bit  $\tilde{w}(i, j)$  for a block (i, j) is decoded by the following rule:



Figure 4.2: Block diagram of watermark decoding process

Algorithm 4.2: The watermark decoding scheme

**Inputs:** Watermarked image, Secret key (K), Step size  $(\Delta)$ .

1. Partition the watermarked image into non-overlapping  $(8 \times 8)$  blocks.

2. Perform Step 3 to Step 6 of Algorithm 4.1.

3. Calculate the minimum distances, i.e.  $A_L$  and  $B_L$ , using Eq. (4.4) and Eq. (4.5).

4. Extract the permuted watermark in the form of '0' and '1' using Eq. (4.6).

5. Eliminate the self-noise introduced by data embedding using Eq. (4.7).

6. Apply the reverse permutation to the extracted watermark to reconstruct the original watermark.

7. Compute the 2D-IDCT to obtain the host image with improved quality.

**Outputs:** Decoded host image with improved quality, Decoded watermark.

The self-noise due to watermark embedding in a block (i, j) is suppressed to obtain a better-quality image, as represented by Eq. (4.7).

$$y'_{q} = \begin{cases} y_{q} + (k' - 1) \times d_{t,q}(0) & \text{if } \widetilde{W}(i,j) = 0\\ y_{q} - (k' - 1) \times d_{t,q}(1) & \text{else} \end{cases}, \tag{4.7}$$

where  $y'_q$  represents the watermarked DCT coefficient after noise elimination. Block-based IDCT is then performed to obtain the high-quality decoded host image. The extracted watermark ( $\tilde{W}$ ) bits are then reversely permuted and XORed with random bits to obtain the decoded watermark ( $\hat{W}$ ). The random bits are generated using the same secret key (K) that was used for watermark permutation at the encoder.
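The decoding steps can be illustrated with a small Python model of Eqs. (4.4), (4.5) and (4.7). It is a sketch under stated assumptions: the quantizer rounds to the nearest multiple of the step size, the bit decision follows the comparison used later in Algorithm 4.4 (bit '1' when $A_L \ge B_L$), and the three-coefficient block and its dither values are made up for illustration.

```python
import math

def quantize(x, delta):
    """Uniform quantizer Q(x, delta): nearest multiple of delta (assumed form)."""
    return delta * math.floor(x / delta + 0.5)

def distances(y, d0, d1, delta):
    """A_L and B_L of Eqs. (4.4)-(4.5) for one block of received coefficients."""
    a = sum(abs(quantize(v - p, delta) + p - v) for v, p in zip(y, d0))
    b = sum(abs(quantize(v + q, delta) - q - v) for v, q in zip(y, d1))
    return a, b

def decode_block(y, d0, d1, delta, k=2):
    """Estimate the embedded bit, then suppress the self-noise, Eq. (4.7)."""
    a, b = distances(y, d0, d1, delta)
    bit = 1 if a >= b else 0  # comparison taken from Algorithm 4.4
    if bit == 0:
        restored = [v + (k - 1) * p for v, p in zip(y, d0)]
    else:
        restored = [v - (k - 1) * q for v, q in zip(y, d1)]
    return bit, restored

# Hypothetical dithers for a step size of 12 (d1 follows Eq. (4.2)).
d0 = [-1.0, 2.0, -4.0]
d1 = [5.0, -4.0, 2.0]
received = [35.0, -22.0, -4.0]  # lies on the bit-'0' lattice m*12 + d0
print(decode_block(received, d0, d1, 12))
```

For this block $A_L = 0$ and $B_L = 8$, so bit '0' is detected and the restored coefficients are `[34.0, -20.0, -8.0]`.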

#### 4.3 Proposed VLSI Architecture [203]:

This section presents the proposed hardware architecture of the adaptive QIM-based data hiding scheme for quality access control of images described in Section 4.2. Prototyping a large system on an FPGA involves many complex transformations and optimization algorithms. The Xilinx ISE 14.5 software tool is used to automate the implementation of the proposed design on the FPGA.

An FPGA is a collection of generic logic cells and programmable switches. A number of different functions can be realized by configuring the programmable switches. First, the prototype circuit is described in a hardware description language (HDL). The Xilinx XST tool is then used to synthesize the hardware. Finally, the synthesized design is downloaded into the FPGA device to obtain a custom circuit.

Here, the encoder and decoder described in Section 4.2 are designed in very-high-speed integrated circuit hardware description language (VHDL) and implemented on a Xilinx Virtex 7 (XC7VX330T-FFG1157) FPGA.

The hardware architecture of the encoder and decoder is built from several circuit modules. The main building blocks of the encoder are 'DCT/IDCT', 'Variance Calculation', 'Dither Generation and Watermark Permutation', and 'Watermark Embedding'.

The overall encoding process is controlled by a well-defined finite state machine (FSM), which provides the timing and control signals needed to operate the watermarking process. The decoder module is an exact replica of the encoder except for the 'Watermark Extraction and Self-Noise Suppression' block.

All the individual modules are designed and tested separately and then combined to perform the watermark encoding and decoding operations.

# 4.3.1 Data Format:

The internal bit representation of the data used in the encoding and decoding calculations plays a vital role in hardware design. Generally, hardware implementations use a fixed-point signed 2's complement data format [1, 9, 7]. This means that the 1st bit, i.e. the most significant bit (MSB), is treated as the sign bit.

The next nine bits, from the 2nd to the 10th bit position, represent the integer part. The fractional part is represented by the last seven bits.

The range of the data format [1, x, y] is set by Eq. 3.3 of Chapter 3. Here, the range of the data format is from -512 to 511.9921875, with a precision of 0.0078125.
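The stated range and precision can be checked with a short Python model of a signed 2's complement format with 9 integer bits and 7 fractional bits. The helper names are hypothetical, and rounding to the nearest code with saturation is an assumption; the hardware may truncate instead.

```python
INT_BITS, FRAC_BITS = 9, 7  # [1, 9, 7]: sign bit, integer bits, fractional bits
SCALE = 1 << FRAC_BITS      # 128 codes per unit step
MIN_CODE = -(1 << (INT_BITS + FRAC_BITS))
MAX_CODE = (1 << (INT_BITS + FRAC_BITS)) - 1

def to_fixed(x):
    """Round x to the nearest representable code, saturating at the limits."""
    code = int(round(x * SCALE))
    return max(MIN_CODE, min(MAX_CODE, code))

def to_float(code):
    """Convert a fixed-point code back to its real value."""
    return code / SCALE

print(to_float(MIN_CODE), to_float(MAX_CODE), 1 / SCALE)
# -512.0 511.9921875 0.0078125
```

The smallest step, $2^{-7} = 0.0078125$, is the precision quoted above, and the largest value is $2^{9} - 2^{-7} = 511.9921875$.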

# 4.3.2 Watermark Encoder Architecture:

The real-time implementation of the encoder data-path is shown in Figure 4.3. The host image pixels  $(N \times N)$  from an image data acquisition system are stored in the 'Image RAM', and at the same time the watermark is stored in the read-only memory 'ROM-2'.

The encoder fetches the watermark from 'ROM-2' during the watermarking process. The host image is divided into non-overlapping blocks of size (8×8). Based on the variance, the image blocks are categorized into four sets, and four different step sizes ( $\Delta$ ) are used to generate four different dithers for watermark encoding and decoding.

The DCT is then computed for each  $(8 \times 8)$  block. The watermark bit is embedded into the DCT coefficients of the host image using the algorithm described in Section 4.2.1.

The watermarked image pixels are then formed by the IDCT. Note that the  $(8\times8)$  block size is selected to make the scheme compatible with the Joint Photographic Experts Group (JPEG) codec.



\*CU= Control Unit, M=ROM

#### Figure 4.3: Watermark encoder data-path

The detailed discussion of watermark embedding hardware is given below.

#### 4.3.2.1 Image RAM Block:

The 'Image RAM' chip is configured to store the data acquired from the image acquisition system.

The address bus of the RAM chip, of width [17:0], addresses the 8-bit grayscale image of size (512×512).

Note that the 'Image RAM' chip places the selected byte on the data bus when the write enable  $(r_w)$  input is high.

The register transfer level (RTL) view and its port details are presented in Figure 4.4(a).
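The 18-bit address width follows directly from the image size, since $512 \times 512 = 2^{18}$ pixels. The sketch below checks this and lists the addresses of one (8×8) block. The row-major pixel layout and the function names are assumptions made for illustration; the source does not state how pixels are ordered in the 'Image RAM'.

```python
WIDTH = HEIGHT = 512
ADDR_BITS = (WIDTH * HEIGHT - 1).bit_length()  # bits needed for the last address

def pixel_address(row, col):
    """Address of a pixel, assuming a row-major layout."""
    return row * WIDTH + col

def block_addresses(block_row, block_col, size=8):
    """Addresses of the 64 pixels of one (8x8) block, in raster order."""
    r0, c0 = block_row * size, block_col * size
    return [pixel_address(r0 + r, c0 + c)
            for r in range(size) for c in range(size)]

print(ADDR_BITS)                    # 18
print(block_addresses(0, 1)[:3])    # [8, 9, 10]
```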



Efficient Hardware Implementation of Data Hiding Scheme for Quality Access Control......

# Figure 4.4: (a) The RTL view of 'Image RAM' block and control signal description of 'Image RAM'; (b) The RTL view of 'Block Buffer' and control signal description of 'Block Buffer'

# 4.3.2.2 Block Buffer:

A dual-port block memory (i.e. dual-port RAM) is designed to speed up the process and to reduce the overall logic required to store intermediate results.

The 'Block Buffer' has  $(8 \times 8)$  locations, which are accessed by dividing the memory space into two memory banks. Each bank is addressed through a different addressing port.

The speedup is achieved by performing read and write operations simultaneously. Figure 4.4(b) depicts the RTL view of the 'Block Buffer' along with its port details.

# 4.3.2.3 DCT/IDCT Block:

The basic block diagram of the DCT consists of two one-dimensional (1D) transforms of the image matrix.

First, the 1D DCT is performed along the rows and then along the columns, based on the fast DCT algorithm proposed in [201].

The 2D-DCT matrix is factorized through the even- and odd-indexed coefficients of the 1D DCT, as shown in Eqs. (4.9) and (4.10).

$$\begin{bmatrix} X_{0} \\ X_{2} \\ X_{4} \\ X_{6} \end{bmatrix} = \begin{bmatrix} \cos(\pi/4) & \cos(\pi/4) & \cos(\pi/4) & \cos(\pi/4) \\ \cos(\pi/8) & \sin(\pi/8) & -\sin(\pi/8) & -\cos(\pi/8) \\ \cos(\pi/4) & -\cos(\pi/4) & -\cos(\pi/4) & \cos(\pi/4) \\ \sin(\pi/8) & -\cos(\pi/8) & \cos(\pi/8) & -\sin(\pi/8) \end{bmatrix} \begin{bmatrix} x_{0} + x_{7} \\ x_{1} + x_{6} \\ x_{2} + x_{5} \\ x_{3} + x_{4} \end{bmatrix}, \tag{4.9}$$

$$\begin{bmatrix} X_{1} \\ X_{3} \\ X_{5} \\ X_{7} \end{bmatrix} = \begin{bmatrix} \cos(\pi/16) & \cos(3\pi/16) & \sin(3\pi/16) & \sin(\pi/16) \\ \cos(3\pi/16) & -\sin(\pi/16) & -\cos(\pi/16) & -\sin(3\pi/16) \\ \sin(3\pi/16) & -\cos(\pi/16) & \sin(\pi/16) & \cos(3\pi/16) \\ \sin(\pi/16) & -\sin(3\pi/16) & \cos(3\pi/16) & -\cos(\pi/16) \end{bmatrix} \begin{bmatrix} x_{0} - x_{7} \\ x_{1} - x_{6} \\ x_{2} - x_{5} \\ x_{3} - x_{4} \end{bmatrix}, \tag{4.10}$$

where  $(X_0, X_2, X_4, X_6)$  are the even 1D coefficients and  $(X_1, X_3, X_5, X_7)$  are the odd 1D coefficients for an 8-point input vector ' $x_0, x_1, x_2, x_3, \dots, x_7$ '.
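The even/odd factorization can be verified numerically against the direct 8-point DCT definition. The Python sketch below uses the standard even and odd coefficient matrices of the 8-point DCT, written with $c(k) = \cos(k\pi/16)$, so that $c(5) = \sin(3\pi/16)$, $c(6) = \sin(\pi/8)$ and $c(7) = \sin(\pi/16)$. The overall scale factor is omitted, as in the equations above, and the sample input row is made up.

```python
import math

def c(k):
    """Shorthand for cos(k*pi/16)."""
    return math.cos(k * math.pi / 16)

# Rows produce X0, X2, X4, X6 from the sums x[n] + x[7-n].
EVEN = [
    [c(4), c(4), c(4), c(4)],
    [c(2), c(6), -c(6), -c(2)],
    [c(4), -c(4), -c(4), c(4)],
    [c(6), -c(2), c(2), -c(6)],
]
# Rows produce X1, X3, X5, X7 from the differences x[n] - x[7-n].
ODD = [
    [c(1), c(3), c(5), c(7)],
    [c(3), -c(7), -c(1), -c(5)],
    [c(5), -c(1), c(7), c(3)],
    [c(7), -c(5), c(3), -c(1)],
]

def dct8_even_odd(x):
    """8-point 1D DCT via two 4x4 matrix products."""
    s = [x[n] + x[7 - n] for n in range(4)]
    d = [x[n] - x[7 - n] for n in range(4)]
    out = [0.0] * 8
    for r in range(4):
        out[2 * r] = sum(EVEN[r][n] * s[n] for n in range(4))
        out[2 * r + 1] = sum(ODD[r][n] * d[n] for n in range(4))
    return out

def dct8_direct(x):
    """Reference: X_k = c_k * sum_n x[n] cos((2n+1)k*pi/16), c_0 = cos(pi/4)."""
    out = []
    for k in range(8):
        scale = c(4) if k == 0 else 1.0
        out.append(scale * sum(x[n] * math.cos((2 * n + 1) * k * math.pi / 16)
                               for n in range(8)))
    return out

row = [52, 55, 61, 66, 70, 61, 64, 73]  # made-up pixel row
print(max(abs(a - b) for a, b in zip(dct8_even_odd(row), dct8_direct(row))))
```

The factorized form needs two 4×4 products (32 multiplications) per row instead of the 64 of a full 8×8 product, which is the saving that the hardware exploits.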

A special effort has been made to develop DCT and IDCT hardware that achieves higher throughput than a software realization.

Moreover, the special-purpose DCT hardware reduces the computational load on the processor and thereby improves the performance of the access control system as described in Section 2.

To improve the data rate, we use a multiplier-less technique together with the distributed arithmetic (DA) technique.

In an FPGA-based system, look-up tables (LUTs), random access memories (RAMs), and read-only memories (ROMs) are used to compute and store the intermediate values of the coefficients.

The hardware architecture of the 2D DCT/IDCT is optimized in terms of resource utilization, speed, and power consumption [166].






#### Figure 4.5: (a) RTL view of 2D-DCT architecture along with the detailed circuit; (b) RTL view of 2D-IDCT architecture and the detailed circuit diagram

The FPGA-synthesized RTL views of the 2D DCT and IDCT modules are depicted in Figure 4.5. The DCT block has 8 data input lines, each 8 bits wide, and 8 data output lines, each 12 bits wide. The eight coefficients (i.e. 64 bits) of each row are shifted into the register during the first clock cycle. The 1-D DCT is then computed, and the output is transferred to a serial-in parallel-out transpose buffer. The computed coefficients are rearranged in ascending order and stored in the transpose buffer.

After every eight execution cycles, the transpose buffer is filled with 8 rows of DCT coefficients. The coefficients are then transferred in parallel, column-wise, to the next 1-D DCT module to perform the 2-D DCT. To increase the processing speed of the DCT/IDCT module, the scheme uses the structure proposed in [166]. The implemented DCT/IDCT modules can process an (8×8) image block in 65 clock cycles.

#### 4.3.2.4 Variance Calculation Block:

The variance of each (8×8) block is calculated to select the step size ( $\Delta$ ) for the watermarking process. The top-level RTL view of the FPGA-based variance calculator, its state diagram, and the simulated output are shown in Figure 4.6.

Figure 4.6: Variance calculation: (a) RTL view; (b) State diagram; (c) Simulated output

When 'trigger\_en = 1' and 'reset = 0', the variance calculation block starts reading the 64 pixels from the buffer memory and calculates the mean pixel value of the block. In the next state, the variance is calculated from the mean and the pixel values. The calculated variance is then validated by the active-high 'variance\_valid' signal.
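The two states of the variance calculator correspond to a two-pass computation: one pass for the mean and one pass for the squared deviations. A minimal software model, assuming the population variance (division by the number of pixels) as in Step 3 of Algorithm 4.1, could look like this; the function name is hypothetical.

```python
def block_variance(pixels):
    """Two-pass variance of one block, mirroring the two calculator states."""
    n = len(pixels)
    mean = sum(pixels) / n                             # first state: mean
    return sum((p - mean) ** 2 for p in pixels) / n    # second state: variance

flat_block = [128] * 64            # uniform block
edge_block = [0] * 32 + [2] * 32   # made-up block with two grey levels
print(block_variance(flat_block), block_variance(edge_block))  # 0.0 1.0
```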

# 4.3.2.5 Dither Generation and Watermark Permutation Block:

This module generates the dither sequences and calculates the permuted watermark. 'Dither Generation' is the main sub-module of the module.

The RTL views of the 'Delta Generation' and 'Dither Generation' units are shown in Figure 4.7(a) and (b), respectively.

Figure 4.7: (a) RTL view of 'Delta Generation' unit; (b) RTL view of 'Dither Generation' unit; (c) Control signal description of (a) and (b)

The 'Delta Generation' unit selects a predefined step size, i.e. delta ( $\Delta$ ), from the set (10, 12, 14, 16) based on the calculated variance. Note that the scheme selects a large step size for a low-variance block, as described in Section 4.2.1. The 'Dither Generation' unit generates four orthogonal sequences of 'D0' and 'D1' based on the step size ( $\Delta$ ). The 'Dither Generation' unit contains two separate ROMs. The pseudo-random sequence (K), which is used to generate the dither, is stored in ROM-1. The permuted watermark is generated from the given watermark, which is stored in ROM-2. The calculated dither and the permuted watermark W' are then transferred to the 'Watermark Embedding' block to embed the watermark into the DCT coefficients.
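The behaviour of the 'Delta Generation' unit can be pictured as a small lookup. In the sketch below, only the step-size set (10, 12, 14, 16) and the rule "larger step for lower variance" come from the text; the variance thresholds and the function name are hypothetical placeholders, since the actual category boundaries are not given here.

```python
STEP_SIZES = (16, 14, 12, 10)        # ordered from low to high variance
THRESHOLDS = (100.0, 400.0, 900.0)   # hypothetical category boundaries

def select_delta(variance):
    """Return the step size for a block: larger step for lower variance."""
    for limit, step in zip(THRESHOLDS, STEP_SIZES):
        if variance < limit:
            return step
    return STEP_SIZES[-1]

print([select_delta(v) for v in (0.0, 250.0, 500.0, 5000.0)])  # [16, 14, 12, 10]
```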



# Figure 4.8: (a) RTL view of 'Watermark Embedding' block; (b) Control signal description of (a)

#### 4.3.2.6 Watermark Embedding Block:

The watermark embedding block is responsible for the modification of the incoming DCT coefficient from the block buffer memory.

The detailed embedding process is described in Algorithm 4.3. The RTL view of the watermark embedding block and its port details are shown in Figure 4.8(a) and (b), respectively. Watermark embedding is accomplished by parallel processing based on the dithers (D0, D1) and the permuted watermark (W').

| Algorithm 4.3: Watermark Bit Embedding                                                     |
|--------------------------------------------------------------------------------------------|
| Start                                                                                      |
| If $W' = 0$ then D0 value is selected;                                                     |
| X = [Q (incoming DCT coefficient + (2.D<sub>0</sub>)) - D<sub>0</sub>];                    |
| End If                                                                                     |
| If $W' = 1$ then D1 value is selected;                                                     |
| X = [Q (incoming DCT coefficient - (2.D<sub>1</sub>)) - D<sub>1</sub>];                    |
| End If                                                                                     |
| End                                                                                        |

\*M= Multiplier, S= Summer, SB= Subtract, B= Buffer

# Figure 4.9: (a) Coefficient modulation unit (CMU); (b) Parallel watermarking block organization

The coefficient modulation unit (CMU) modulates the DCT coefficients (Xq) to embed the watermark (W') using the selected dither sequences, i.e. D0 and D1.

The modified coefficients are then transferred to the buffer memory via the data bus. The latency of the embedding block is improved by operating eight CMUs in parallel.

The parallel operation of the CMUs takes only 8 clock cycles to process 64 coefficients.

Figure 4.9(a) and (b) show the coefficient modulation unit (CMU) and the parallel implementation of the embedding process, respectively.
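The cycle count of the parallel embedding block can be illustrated with a simple scheduling model. It assumes that each modulation unit handles one coefficient per clock cycle, so that a single unit would need 64 cycles for a block; that serial figure is an assumption, and the function name is hypothetical.

```python
def schedule(num_coeffs, lanes):
    """Group coefficient indices by the clock cycle in which they are processed."""
    return [list(range(start, min(start + lanes, num_coeffs)))
            for start in range(0, num_coeffs, lanes)]

serial = schedule(64, 1)     # one modulation unit
parallel = schedule(64, 8)   # eight modulation units side by side
reduction = 1 - len(parallel) / len(serial)
print(len(serial), len(parallel), reduction)  # 64 8 0.875
```

Under this assumption the eight-lane arrangement removes 87.5% of the modulation cycles of a block.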



Figure 4.10: State machine: watermark encoder

#### 4.3.2.7 Control Unit Block:

The control unit provides the timing sequences needed to perform the encoding operation.

The finite state machine (FSM) based control unit controls the overall process by providing predetermined timing and control signals. The FSM has eight states, from 'ST0' to 'ST7', as shown in Figure 4.10. The active-high 'trigger\_en' signal initiates the watermark embedding process. State 'ST1' transfers the first (8×8) image pixel block from the 'Image RAM' to the 'Block Buffer'. State 'ST2' calculates the variance of the block and stores it temporarily. The next two states, 'ST3' and 'ST4', compute the block-based 8-point DCT. Depending on the variance, state 'ST5' generates the dither, embeds the watermark into the DCT coefficients, and writes them back to the same (8×8) block buffer.

States 'ST6' and 'ST7' compute the IDCT, which transforms the watermarked DCT coefficients into watermarked pixels. An  $(8\times8)$  buffer is used as temporary storage for the DCT coefficients and the IDCT values. The IDCT output stage also verifies the existence of the watermark for all the image blocks. Lastly, a check is made on whether all the  $(8\times8)$  pixel blocks have been watermarked. If the condition is satisfied, the programme control (PC) moves to the 'ST0' (i.e. idle) state.
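The state sequence of the encoder control unit can be modelled in software as follows. The state names and their roles come from the description above; the assumption that the machine returns from 'ST7' to 'ST1' while unprocessed blocks remain is an interpretation of the final check, and the helper name is hypothetical.

```python
STATE_ACTIONS = {
    "ST0": "idle, wait for trigger_en",
    "ST1": "copy one (8x8) block from Image RAM to Block Buffer",
    "ST2": "calculate block variance",
    "ST3": "DCT, first pass",
    "ST4": "DCT, second pass",
    "ST5": "generate dither and embed watermark bit",
    "ST6": "IDCT, first pass",
    "ST7": "IDCT, second pass and block-count check",
}

def encoder_trace(num_blocks):
    """Sequence of states visited while watermarking num_blocks blocks."""
    trace = ["ST0"]
    for _ in range(num_blocks):
        # Assumed loop: ST7 returns to ST1 until every block is processed.
        trace += ["ST1", "ST2", "ST3", "ST4", "ST5", "ST6", "ST7"]
    trace.append("ST0")  # all blocks watermarked: back to idle
    return trace

print(encoder_trace(1))
```

For a (512×512) image with 4096 blocks, this model visits 'ST1' to 'ST7' 4096 times before returning to the idle state.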

# 4.3.3 Watermark Decoder Architecture:

The hardware architecture of the watermark decoding unit is an exact replica of the watermark encoding unit except for the watermark extraction block. Following the decoding algorithm, image pixels are read block-wise from the 'Image RAM', and the variance is calculated to choose the step size.


The dither sequences ('D0' and 'D1') are generated by the dither generator block from the calculated step size.

The watermark extraction block accepts the dither sequences and calculates the minimum distance to predict the watermark bit, i.e. '0' or '1'.

The data path of the watermark extraction block is shown in Figure 4.11.

| Algorithm 4.4: Watermark Bit Extraction                        |
|----------------------------------------------------------------|
| Start                                                          |
| If Block buffer location $\neq$ "000000" then                  |
| A= [Block buffer] - [(Y  [[Block buffer] + (D0)]/Y ) - D0];    |
| B= [Block buffer] - [(Y  [[Block buffer] + (D1)]/Y ) - D1];    |
| If A≥B then                                                    |
| Watermark = 1;                                                 |
| Else                                                           |
| Watermark = 0;                                                 |
| End if;                                                        |
| End if;                                                        |
| End                                                            |



\*CU= Control Unit, VC=Variance Calculation, WE=Watermark Extraction, SN= Self Noise

Figure 4.11: Decoder data-path



#### Figure 4.12: (a) RTL view of 'Watermark Extraction and Self-Noise Suppression' block; (b) Control signal descriptions of (a)

The detailed RTL view of the watermark extraction and self-noise suppression block and its port details are shown in Figure 4.12(a) and (b), respectively.

Depending on the detected watermark bit, the self-noise in that block is suppressed from the DCT coefficients in parallel. The DCT coefficients are transferred back into the  $(8 \times 8)$  block buffer via a data bus, as shown in Figure 4.13.

The parallel decoding architecture reduces the latency by up to 87.5% compared with the serial decoding architecture. The IDCT is then performed on the restored coefficients in the block buffer to obtain higher-quality image pixels.

Meanwhile, the predicted watermark bits are reversely permuted to recover the original watermark and are temporarily buffered in the watermark extraction block. After the decoded DCT coefficients have been transferred to the block buffer memory, the buffered watermark data are transferred to the block buffer memory.



Figure 4.13: (a) Coefficient demodulation unit (CDU); (b) Parallel watermark decoding block organization

The prototype implementation of the decoder does not provide a readout facility for the extracted watermark bits.

However, they can be fetched by interfacing extra input-output (I/O) ports.

### 4.3.3.1 Control Unit Block:

The watermark decoding and self-noise removal process is governed by the eight distinct states ('ST0' to 'ST7') of an FSM.

The state machine is shown in Figure 4.14. From the figure, it is clear that the extraction of a watermark bit follows the same steps, i.e. 'ST0' to 'ST4', as watermark embedding. State 'ST5' estimates whether '0' or '1' is embedded in the DCT coefficients by computing the minimum distance, as described in the decoding algorithm.

The last two states, 'ST6' and 'ST7', perform the IDCT to recover the decoded image pixels.



Figure 4.14: State machine: watermark decoder

#### 4.4 Performance Evaluation:

The performance of the access control hardware is evaluated over a large number of benchmark images.

The hardware simulation results are explained in this section. The experiments are conducted on an Intel(R) Core(TM) i3-4005U CPU (1.7 GHz) with 8 GB RAM and a 64-bit OS, using MATLAB (R2014a) and Xilinx tools (ISE Design Suite 14.5).

We have chosen a Xilinx Virtex 7 (XC7VX330T-FFG1157) series FPGA to implement the design.

First, the FPGA-based prototype hardware design is implemented and tested with a serial architecture; the performance of the prototype is then improved by incorporating a parallel architecture.

The hardware is evaluated for  $(512 \times 512)$  grayscale images. A binary watermark of size  $(64 \times 64)$  is used.

Correspondingly, a  $(512 \times 512)$  host image has  $(64 \times 64)$  non-overlapping blocks.

The design may be extended to larger image sizes, say (N×N), for real-time applications.


Table 4.1: Hardware description language (HDL) synthesis summary of encoder and decoder in terms of resource utilization (Sr. No. 1-7), timing report (Sr. No. 8-10) and throughput (Sr. No. 11). Target board: Virtex 7 (XC7VX330T-FFG1157-3).

| Sr. No. | Parameter | Encoder (Serial), 256×256 | Decoder (Serial), 256×256 | Encoder (Parallel), 256×256 | Decoder (Parallel), 256×256 | Encoder (Serial), 512×512 | Decoder (Serial), 512×512 | Encoder (Parallel), 512×512 | Decoder (Parallel), 512×512 |
|---------|-----------|---------|---------|------------|------------|-----------|------------|------------|-----------|
| 1       | A         | 789     | 3201    | 849        | 3441       | 869       | 4897       | 988        | 5201      |
| 2       | B         | 415     | 1196    | 458        | 1257       | 416       | 1273       | 474        | 1396      |
| 3       | C         | 161     | 574     | 161        | 582        | 128       | 475        | 161        | 664       |
| 4       | D         | 62      | 125     | 62         | 132        | 46        | 115        | 62         | 140       |
| 5       | E         | 1       | 1       | 4          | 4          | 1         | 1          | 4          | 4         |
| 6       | F         | 1       | 1       | 4          | 4          | 1         | 1          | 8          | 8         |
| 7       | G         | 16      | 16      | 16         | 16         | 22        | 22         | 22         | 22        |
| 8       | H         | 131.167 | 131.167 | 131.07     | 131.11     | 119.431   | 119.275    | 131.167    | 131.167   |
| 9       | I         | N.A     | N.A     | 0.00009803 | 0.00009814 | 0.0003863 | 0.00038687 | 0.00019467 | 0.0001946 |
| 10      | J         | N.A     | N.A     | 12850      | 12868      | 46140     | 46145      | 25534      | 25537     |
| 11      | K         | N.A     | N.A     | 0.66846    | 0.66773    | 0.678     | 0.677      | 1.34       | 1.34      |

**Note: A**= Number of slice LUT, **B**= Number of slice registers, **C**= Number of fully used LUTs, **D**= Bonded IOBs, **E**= Number of BUFG, **F**= Number of block RAM, **G**= Number of DSP 48E1s, **H**= Maximum frequency (MHz), **I**= Hardware execution time (NC) (second), **J**= Total clock required, **K**= Throughput (Gigabyte/s).

The VLSI circuit of the access control scheme is characterized in terms of component requirements, power consumption, and throughput. The basic architecture of a reconfigurable FPGA is discussed in Chapter 2.

The design cost of the architecture is evaluated in terms of the basic entities of field-programmable devices, such as configurable logic blocks (CLBs), look-up tables (LUTs), multiplexers (MUXs), the programmable switching matrix, flip-flops, built-in block RAMs (BRAMs), input/output (I/O) ports, and block buffers (BUFG).

The logic utilization of the serial and parallel implementations of the scheme for different image sizes (i.e.  $256 \times 256$  and  $512 \times 512$ ) is given in Table 4.1.

The parallel architecture is more resource hungry than the serial architecture, but it takes less time to process the bitstream.

We have calculated the resource implementation factor for the  $(512 \times 512)$  image.

The parallel architecture requires 15.91% and 9.59% more resources for the encoder and decoder, respectively, than the serial implementation.
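These percentages can be reproduced from the (512×512) columns of Table 4.1 if the resource figure is taken to be the sum of rows 1 to 7 (A to G). That definition is an assumption inferred from the numbers, not one stated in the text, and the variable names are hypothetical.

```python
# Rows A..G of Table 4.1 for the (512x512) image.
RESOURCES = {
    "encoder_serial":   [869, 416, 128, 46, 1, 1, 22],
    "encoder_parallel": [988, 474, 161, 62, 4, 8, 22],
    "decoder_serial":   [4897, 1273, 475, 115, 1, 1, 22],
    "decoder_parallel": [5201, 1396, 664, 140, 4, 8, 22],
}

def overhead_percent(unit):
    """Extra resources of the parallel design relative to the serial one."""
    serial = sum(RESOURCES[unit + "_serial"])
    parallel = sum(RESOURCES[unit + "_parallel"])
    return 100.0 * (parallel - serial) / serial

print(round(overhead_percent("encoder"), 2))  # 15.91
print(overhead_percent("decoder"))            # about 9.6
```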

At the same time, the parallel encoder and decoder architectures save 0.19166 ms and 0.192185 ms of processing time, respectively, compared with the serial implementation.

Parallel processing takes only 25534 and 25537 clock cycles to encode and decode the watermark, respectively, at an average operating frequency of 131.16 MHz. The throughput of the embedding and extraction units is calculated by Eq. (2.17) of Chapter 2.
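Rows 9 to 11 of Table 4.1 can be cross-checked with a few lines of Python. The sketch assumes that the execution time is the clock count divided by the maximum frequency, and that the throughput is the number of image bytes (one byte per pixel) divided by the execution time, with 1 Gigabyte taken as $10^9$ bytes. These definitions reproduce the tabulated values but are not quoted from Eq. (2.17); the function names are hypothetical.

```python
def execution_time(clock_cycles, freq_mhz):
    """Execution time in seconds for a given clock count and frequency."""
    return clock_cycles / (freq_mhz * 1e6)

def throughput_gbytes(image_side, seconds, bytes_per_pixel=1):
    """Throughput in Gigabyte/s (10**9 bytes) for a square grayscale image."""
    return image_side * image_side * bytes_per_pixel / seconds / 1e9

# Parallel encoder, (512x512) image: 25534 cycles at 131.167 MHz.
t_enc = execution_time(25534, 131.167)
print(t_enc)                          # about 0.0001947 s
print(throughput_gbytes(512, t_enc))  # about 1.35 Gigabyte/s (1.34 in Table 4.1)
```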

The throughput of the access control hardware is given in the 11<sup>th</sup> row of Table 4.1. FPGA-based systems consume two types of power: static power and dynamic power.

The static power is determined by the total amount of resources utilized in the system [206], whereas the dynamic power consumption (Pd) depends on factors such as the effective capacitance (CL), the input clock frequency (f), and the voltage swing (V) of the VLSI circuit.

The dynamic power consumption is expressed by Eq. (2.15) of Chapter 2. The total power consumption is estimated with the Xilinx XPower Analyzer (XPA) tool and is summarized in Table 4.2.

The parallel implementation of the scheme requires fewer clock cycles for the encoding and decoding operations. Hence, its power consumption is found to be lower than that of the serial implementation.

| Image Size | Serial Encoder: S | Serial Encoder: D | Serial Encoder: T | Serial Decoder: S | Serial Decoder: D | Serial Decoder: T | Parallel Encoder: S | Parallel Encoder: D | Parallel Encoder: T | Parallel Decoder: S | Parallel Decoder: D | Parallel Decoder: T |
|------------|-------|------|-------|-------|------|-------|-------|------|-------|-------|------|-------|
| (256×256)  | N.A   | N.A  | 58.45 | N.A   | N.A  | 58.57 | 55.86 | 1.01 | 56.87 | 54.44 | 1.95 | 56.39 |
| (512×512)  | 75.28 | 4.01 | 79.19 | 75.71 | 4.12 | 79.8  | 76.64 | 1.81 | 78.45 | 76.52 | 2.05 | 78.57 |

 Table 4.2: Synthesized power consumption report.

Note: S: Static Power (in mW), D: Dynamic Power (in mW), T: Total Power (in mW).


| Scheme | Image Size | Working Domain | Unit | Occupied Slices | Slice Registers | Slice LUTs | BRAMs | Total Power (mW) | Clock (MHz) |
|--------|------------|----------------|------|-----------------|-----------------|------------|-------|------------------|-------------|
| Maity and Maity, 2014 | (32×32) | Spatial | Embedding | 9881 | 9347 | 11291 | 3 | 750 | 98.7 |
| | | | Decoding | 14600 | 12531 | 28753 | 3 | | |
| Darji et al., 2013 | (512×512) | DWT (Haar) | Embedding (Pipelined Implementation) | 1729 | 900 | 3153 | 305 | 103 | 87.82 |
| | | | Embedding (Parallel Implementation) | 2002 | 1027 | 3453 | 305 | 99 | |
| Das et al., 2018 | (512×512) | Difference Expansion (DE) | Encoder | NA | 20 | 90 | NA | 100 | 176.24 |
| | | | Decoder | NA | 20 | 90 | NA | 200 | 173.93 |
| Proposed Scheme | (512×512) | DCT | Embedding (Parallel Implementation) | 988 | 474 | 1301 | 8 | 78.51 | 131.16 |
| | | | Decoding (Parallel Implementation) | 5201 | 1396 | 5933 | 8 | | |

Table 4.3: Comparison of results in terms of resource utilization and power consumption

Note: A blank cell shares the value given in the row above.

According to this analysis, the proposed watermark embedding unit for the  $(512 \times 512)$  sized image requires 79.19 mW and 78.45 mW of power for the serial and parallel implementations, respectively.

On the other hand, the watermark extraction unit requires 79.83 mW and 78.57 mW of power for the serial and parallel implementations, respectively. Table 4.3 compares the overall performance with the related works [76, 79].

The parallel implementation consumes only 78.51 mW of power on average. The scheme consumes about 90% less power than the scheme implemented in [76].

Furthermore, for the embedding process the proposed scheme saves 23.77% and 20.69% power compared with the pipelined and parallel implementations of [81], respectively.
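The quoted savings follow from the total power column of Table 4.3 as (reference - proposed) / reference. The short check below uses only the table values; the function and dictionary names are illustrative, and the row labels follow the scheme names printed in the table.

```python
def saving_percent(reference_mw: float, proposed_mw: float) -> float:
    """Relative power saving of the proposed design, in percent."""
    return (reference_mw - proposed_mw) / reference_mw * 100.0


PROPOSED_MW = 78.51  # average power of the parallel implementation

# Total power (mW) of the compared designs, read from Table 4.3.
reference_power_mw = {
    "Maity and Maity, 2014": 750.0,
    "Darji et al., 2013 (pipelined)": 103.0,
    "Darji et al., 2013 (parallel)": 99.0,
}

savings = {name: saving_percent(p, PROPOSED_MW)
           for name, p in reference_power_mw.items()}
for name, value in savings.items():
    print(f"{name}: {value:.2f} % lower power")
```

The computed values agree with the figures in the text to within rounding.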

It is clear from Table 4.3 that the implemented hardware architecture is better than the related work in terms of resource utilization.

The encoder design utilizes about 90% fewer FPGA resources than [76] and 44% fewer resources than [79].

Table 4.4: Comparison of results in terms of throughput and data embedding rate (DER).

| Scheme                | Design Type | Throughput        | DER  |
|-----------------------|-------------|-------------------|------|
| Maity and Kundu [71]  | FPGA        | 4.706 Megabits/s  | 0.25 |
| Maity and Maity, [76] | FPGA        | 1.0395 Megabits/s | 0.46 |
| Das et al. [81]       | FPGA        | 35.284 Megabits/s | NA   |
| Proposed Scheme       | FPGA        | 1.34 Gigabyte/s   | 0.02 |



Figure 4.15: Frequency vs. Power and Area (in terms of resource utilization) vs. Frequency trade-off curve

The above discussion shows that the proposed prototype implementation is superior to the related works found in the literature in terms of logic utilization, power consumption, throughput and operating frequency. A selected portion of the Xilinx ISE simulation results of the access control system is shown in Figure 4.16(a-c).

Figure 4.16(a) shows the step size ( $\Delta$ ), the dither sequences 'D0' and 'D1', the binary watermark, and the permuted watermark.

Figure 4.16(b) depicts the block-based pixel representation and the modulation of the DCT coefficients. The extracted image pixels of an  $(8\times8)$  image block are shown in Figure 4.16(c).

Figure 4.16: Hardware description language (HDL) simulation results: (a) Step size 'Delta', dither sequences (D0 and D1) and watermark; (b) Block of image pixels (8×8) and modified DCT coefficients after watermark embedding; (c) Block-wise (8×8) image pixels after self-noise suppression

## 4.5 Chapter Summary:

In this chapter, an efficient parallel hardware implementation of data hiding based quality access control of images in the DCT domain is described. The FPGA based hardware is implemented for quality access control of 8-bit grayscale images.

The implementation results for the  $(512 \times 512)$  sized image show that the encoding system requires only 988 slices and 161 fully used LUTs, with a power consumption of 78.45 mW.

The decoder implementation utilizes 5212 slices and 664 fully used LUTs, with a power consumption of 78.57 mW. The parallel implementations of the encoding and decoding systems require only 25534 and 25537 clock cycles, respectively, at a maximum operating frequency of 131.16 MHz. The last two reported works are in the DCT domain. The next chapter discusses the implementation of a DWT-lifting based data hiding method for quality access control. That scheme adopts a low power design strategy to minimize the power consumption of the hardware system.

FPGA Based Reconfigurable Hardware Architecture for Quality Access Control of Images and Coupling in Fiber Optics Communication

https://www.kdpublications.in

ISBN: 978-93-90847-75-4

Chapter 5

# FPGA Implementation of Lifting Based Data Hiding Scheme for Efficient Quality Access Control of Images

### 5.1 Introduction:

The World Wide Web (WWW) emerged from the revolution in mass communication and information technology. With the advancement of this technology, digital information (images, documents and multimedia content) can be processed and shared from one source to countless destinations. Digital multimedia data can be received remotely, and it can be accessed and replicated without any loss of quality.

This ease of access to digital multimedia content may cause problems in the commercial domain, where new creations on the WWW are vital for commercial benefit. To address this, vendors may restrict common users from accessing the superior quality of the multimedia signal (image), while a genuine user who holds a subscription can access the signal at superior quality. An 'access control' scheme is therefore valuable to commercial vendors and original creators. In the last decade, researchers have devoted considerable effort to the development of access control systems. The related literature is described in Section 1.2.4 of Chapter 1.

A watermarking chip is essential for low power, real-time performance, high reliability, and low-cost applications, and for easy integration with existing consumer electronic devices. Software-only schemes are not sufficient for these purposes. The design of a watermarking chip involves trade-offs among performance, power consumption, silicon area, memory requirement, and integrability. This demands adaptation of the architectural design and efficient hardware description language (HDL) coding to meet the objectives stated above. Although techniques such as DWT, dither quantization, and chaotic permutation are not new, the proposed method has some advantages over other available methods: (1) minimal resource utilization, (2) very low power and (3) high throughput. In this sense, the presented scheme is an improved data hiding method.

This chapter demonstrates an FPGA based low power hardware architecture of a data hiding method for efficient quality 'access control' of grayscale images using lifting based DWT. An FPGA based platform is selected for its low investment cost and its fast and easy system prototyping. Furthermore, the design is reconfigurable. To the best of our knowledge, low power hardware implementations of a lifting based quality access control scheme for grayscale images are not available in the literature to date. In this regard, the proposed work can be treated as novel.

The main objective is to implement a VLSI architecture for the access control encoder and decoder that achieves low power and high throughput with minimum resource utilization. To attain this target, the power-aware design style [184, 206, 207] is used.

The major characteristic of this architecture is its low power design using advanced VLSI techniques, such as the use of embedded digital signal processing (DSP) blocks, disabling of logic blocks when not in operation, synchronous control, clock gating, resource sharing, avoidance of large comparators, and the use of design constraints during synthesis. While operating at 130.14 MHz, the average power consumption of the chip is estimated to be 78.48 mW. Moreover, when an algorithm needs the original host image during detection, the on-chip memory constraint becomes an issue: as the memory requirement increases, both the chip cost and the power consumption increase. The proposed method is memory efficient because the original host image values are not required for extraction.

The main contributions of the proposed implementation are summarized as follows:

*Minimal Resource Utilization:* The proposed prototype encoder and decoder architecture is synthesized and implemented for a (512×512) sized image using the Xilinx synthesis technology (XST) tool to achieve minimal resource utilization. For the (512×512) host image, the encoder requires only 476 slice LUTs (of which only 157 are fully used), 230 slice registers and 12 bonded input-output buffers (IOBs). The decoder requires only 456 slice LUTs (of which only 162 are fully used), 227 slice registers and 12 bonded IOBs.

*Very Low Power Consumption:* According to the Xilinx XPower Analyzer, the power requirement of the proposed encoder and decoder is only 78.52 mW and 78.45 mW, respectively, for an image size of (512×512).

*Higher Embedding Rate:* While operating at 130.094 MHz and 130.191 MHz, the proposed access control encoder and decoder provide throughputs of 23.819 Megabyte/s and 23.835 Megabyte/s, respectively.

The rest of the chapter is organized as follows: Section 5.2 describes the access control scheme, and Section 5.3 proposes its VLSI architecture. Section 5.4 discusses the performance evaluation of the proposed system. Section 5.5 presents the chapter summary.

### **5.2 Access Control Scheme:**

In this section, we explain the data hiding based quality access control algorithm [202] chosen for the VLSI implementation.

At the encoder, a watermark is inserted into the host image to provide quality access control; at the receiver, the embedded watermark is extracted to obtain the host image with improved quality.

Algorithm 5.1: Image encoding process.

**Inputs:** Watermark (W), Key (K), Image Pixel (P<sub>x</sub>).

1. Calculate Permuted Watermark ( $W' = W \oplus K$ ).

2. Calculate lifting based n-level 2-D DWT on image pixel (P<sub>x</sub>).

3. Select coefficient (C) from LL, HL, LH, HH<sub>3</sub>, HH<sub>2</sub>, HH<sub>1</sub> sub-bands.

4. Coefficients (C) are divided into 'g' number of different categories i.e.  $C = (C_1, C_2, C_3 \dots C_g)$.

5. Step Size ( $\Delta$ ) selection for dither modulation.

6. Binary Dither sequence  $d_{g,q}(0)$, and  $d_{g,q}(1)$  are calculated.

7. Modulate coefficient by inserting (W').

8. Compute 2D-IDWT.

Outputs: Watermarked image pixel.

#### 5.2.1 Watermark Encoding:

The encoding process is described by Algorithm 5.1 and shown in Figure 5.1(a). In this algorithm, a binary ( $n \times n$ ) sized watermark (W), shown in Figure 5.1(b), is permuted by an XOR operation with a 2-D random key (K). Figure 5.1(c) shows the output (W') of the XOR operation.
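The XOR permutation of step 1 can be illustrated in a few lines. This is a minimal sketch: the 8×8 size, the seed and the randomly generated watermark are placeholders, not values from the design. It also shows why the same key recovers the watermark at the decoder, since XOR is its own inverse.

```python
import numpy as np

rng = np.random.default_rng(seed=7)  # arbitrary seed, stands in for the secret key source
n = 8                                # illustrative watermark size

W = rng.integers(0, 2, size=(n, n), dtype=np.uint8)  # binary watermark
K = rng.integers(0, 2, size=(n, n), dtype=np.uint8)  # 2-D random key

W_perm = np.bitwise_xor(W, K)       # W' = W xor K (encoder side)
W_back = np.bitwise_xor(W_perm, K)  # applying K again restores W (decoder side)

print(np.array_equal(W_back, W))
```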

The host image  $(m \times m)$  is then transformed by a 3-level lifting based 2-D DWT. A group of coefficients is selected from the different sub-bands, i.e. low-low (LL), high-low (HL), low-high (LH), and high-high (HH), such that a total of 24 coefficients are used to embed one watermark bit. Of these 24 coefficients, four are taken from the LL, HL3, LH3, and HH3 sub-bands (one from each), four are taken from HH2, and the remaining 16 are taken from the HH1 sub-band.
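One plausible way to gather the 24 coefficients for a single watermark bit is sketched below. It reads the 4 + 4 + 16 split as one coefficient from each level-3 sub-band, a 2×2 block from HH2 and a 4×4 block from HH1, all at the same spatial position. That co-location, the dictionary layout and the function name are assumptions made for illustration; the text does not specify the exact indexing.

```python
import numpy as np


def select_coefficients(bands, i, j):
    """Return the 24 coefficients tied to level-3 position (i, j).

    Assumed layout: `bands` maps sub-band names to 2-D arrays, with the
    level-3 bands at 1/8 of the image size, HH2 at 1/4 and HH1 at 1/2.
    """
    level3 = [bands[name][i, j] for name in ("LL3", "HL3", "LH3", "HH3")]  # 4 values
    hh2 = bands["HH2"][2 * i:2 * i + 2, 2 * j:2 * j + 2].ravel()           # 4 values
    hh1 = bands["HH1"][4 * i:4 * i + 4, 4 * j:4 * j + 4].ravel()           # 16 values
    return np.concatenate([level3, hh2, hh1])


# Placeholder sub-bands with the sizes a (512x512) image would give.
bands = {name: np.zeros((64, 64)) for name in ("LL3", "HL3", "LH3", "HH3")}
bands["HH2"] = np.zeros((128, 128))
bands["HH1"] = np.zeros((256, 256))

group = select_coefficients(bands, 63, 63)
print(group.size)
```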





Figure 5.1: (a) Block diagram of watermark encoding process: Encoder; (b) Binary watermark; (c) Transmuted watermark

The selected coefficients are then grouped into 'g' categories (g = 5 in our design). Accordingly, 'g' different step sizes are selected to determine the binary dither sequences. The smallest step size ( $\Delta S$ ) is employed for the modulation of the LL sub-band, because it contains most of the visual information of the image.

The largest step size ( $\Delta L$ ) is used for the HH sub-band. The step sizes for the other sub-bands are selected between ( $\Delta S$ ) and ( $\Delta L$ ). Depending on the step size value, the dither sequences for quantization index modulation (QIM) are generated as:

$$d_{g,q}(0) = \left\{ \Re(key) \times \Delta_g \right\} - \Delta_g/2, \quad 0 \le q \le L - 1 \tag{5.1}$$

and

$$d_{g,q}(1) = \begin{cases} d_{g,q}(0) + \Delta_g/2 & \text{if } d_{g,q}(0) < 0\\ d_{g,q}(0) - \Delta_g/2 & \text{if } d_{g,q}(0) \ge 0 \end{cases} \tag{5.2}$$

where  $\Re(\text{key})$  is a random number generator and the real-valued dither sequences  $d_{g,q}(0)$  and  $d_{g,q}(1)$  are separated by a distance of  $\Delta_g/2$. The symbol 'L' denotes the length of the dither sequences. The permuted watermark (W') is then embedded into the different sub-bands of DWT coefficients as:

$$Y_q = \begin{cases} Q\{x_q - k' \times d_{g,q}(0), \Delta_g\} + d_{g,q}(0) & \text{if } w'(i,j) = 0\\ Q\{x_q + k' \times d_{g,q}(1), \Delta_g\} - d_{g,q}(1) & \text{if } w'(i,j) = 1 \end{cases} \tag{5.3}$$

where  $x_q$  is the q-th DWT coefficient, 'Q' is the uniform quantizer, and  $\Delta_g$  is the step size for the category 'g'.

The symbol $k'$ indicates the quality deprivation factor. The inverse DWT (IDWT) is then performed on the watermarked coefficients to regenerate the watermarked image.
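Equations (5.1) to (5.3) can be sketched in floating point as follows. This is an illustrative software model, not the hardware data path: the uniform quantizer is assumed to round to the nearest multiple of the step size, $\Re(\text{key})$ is assumed to return uniform values in [0, 1), and the function names, the seed and the values of the step size and $k'$ are placeholders.

```python
import numpy as np


def quantize(x, step):
    """Uniform quantizer Q{x, step}: nearest multiple of `step` (assumed form)."""
    return step * np.round(np.asarray(x, dtype=float) / step)


def make_dithers(step, length, seed):
    """Dither sequences d(0) and d(1) following Eqs. (5.1) and (5.2)."""
    rng = np.random.default_rng(seed)                     # stands in for R(key)
    d0 = rng.random(length) * step - step / 2             # Eq. (5.1)
    d1 = np.where(d0 < 0, d0 + step / 2, d0 - step / 2)   # Eq. (5.2)
    return d0, d1


def embed_bit(x, bit, d0, d1, step, k):
    """Modulate the coefficients `x` with one permuted watermark bit, Eq. (5.3)."""
    x = np.asarray(x, dtype=float)
    if bit == 0:
        return quantize(x - k * d0, step) + d0
    return quantize(x + k * d1, step) - d1


step, k = 12.0, 2.0                          # illustrative values only
d0, d1 = make_dithers(step, length=4, seed=2021)
x = np.array([100.3, -20.7, 55.0, 7.9])      # made-up DWT coefficients
y0 = embed_bit(x, 0, d0, d1, step, k)
y1 = embed_bit(x, 1, d0, d1, step, k)
```

After embedding, `y0 - d0` and `y1 + d1` lie on the quantizer grid, which is the property the minimum distance decoder relies on.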

#### 5.2.2 Watermark Decoding:

The decoding process is the reverse process of encoding and is shown in Figure 5.2. The decoding process is explained in Algorithm 5.2.



Figure 5.2: (a) Decoder block diagram; (b) Decoded watermark

The decoder is based on minimum distance decoding. The distances $S_A$ and $S_B$ associated with the dither sequences $d_{g,q}(0)$ and $d_{g,q}(1)$ are calculated by Eqs. (5.4) and (5.5). The permuted watermark bit  $\widetilde{W}(i, j)$  is decoded by Eq. (5.6).

Algorithm 5.2: Image decoding process.

**Inputs:** Watermarked image pixel, Step Size ( $\Delta$ ), Key (K).

1. Calculate lifting based n-level 2-D DWT on watermarked image pixel.

2. Select coefficient(C) from LL, HL, LH, HH<sub>3</sub>, HH<sub>2</sub>, HH<sub>1</sub> sub-band.

3. Coefficients (C) are divided into 'g' number of different categories i.e.  $C = (C_1, C_2, C_3 \dots \dots C_g)$ .

4. Binary Dither sequence  $d_{g,q}(0)$ , and  $d_{g,q}(1)$  are calculated.

5. Watermark Bit Extraction.

6. Noise Cancellation for Access Control.

7. Decoding of Watermark Bit.

Outputs: Watermarked image pixel with improved quality.

A large difference between S<sub>A</sub> and S<sub>B</sub> signifies a lower probability of decoding error.

This property confirms the better quality of access control for the digital image.

$$S_A = \sum_{q=0}^{S-1} \left| Q\left(Z_q - d_{g,q}(0), \Delta_g\right) + d_{g,q}(0) - Z_q \right| \tag{5.4}$$

$$S_B = \sum_{q=0}^{S-1} \left| Q\left(Z_q + d_{g,q}(1), \Delta_g\right) - d_{g,q}(1) - Z_q \right| \tag{5.5}$$

$$\widetilde{W}(i,j) = \begin{cases} 0 & \text{if } S_A < S_B \\ 1 & \text{otherwise} \end{cases} \tag{5.6}$$

where  $Z_q$  is the *q*-th DWT coefficient. Then,  $\widetilde{W}(i, j)$  is XORed with 'K' to recover the decoded watermark ( $\widehat{W}$ ). The decoded watermark is shown in Figure 5.2(b).

The self-noise is then removed by Eq. (5.7).

$$Z'_{q} = \begin{cases} Z_{q} + (k' - 1) \times d_{g,q}(0) & \text{if } \widetilde{W}(i,j) = 0\\ Z_{q} - (k' - 1) \times d_{g,q}(1) & \text{if } \widetilde{W}(i,j) = 1 \end{cases} \tag{5.7}$$

For this design, we have considered five different step sizes, i.e. ( $\Delta_1=11$,  $\Delta_2=12$,  $\Delta_3=13$,  $\Delta_4=14$,  $\Delta_5=15$ ), one for each category of coefficients.
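The decoding rule of Eqs. (5.4) to (5.6) and the self-noise removal of Eq. (5.7) can be sketched as below. As before, this is a floating-point illustration under stated assumptions: the quantizer rounds to the nearest multiple of the step size, the dither values, coefficients and $k'$ are made up, and the function names are hypothetical. The step size 12 corresponds to $\Delta_2$ above.

```python
import numpy as np


def quantize(x, step):
    """Uniform quantizer Q(x, step): nearest multiple of `step` (assumed form)."""
    return step * np.round(np.asarray(x, dtype=float) / step)


def decode_bit(z, d0, d1, step):
    """Minimum distance decoding of one permuted watermark bit."""
    s_a = np.sum(np.abs(quantize(z - d0, step) + d0 - z))  # Eq. (5.4)
    s_b = np.sum(np.abs(quantize(z + d1, step) - d1 - z))  # Eq. (5.5)
    return 0 if s_a < s_b else 1                           # Eq. (5.6)


def remove_self_noise(z, bit, d0, d1, k):
    """Self-noise removal following Eq. (5.7)."""
    return z + (k - 1) * d0 if bit == 0 else z - (k - 1) * d1


step, k = 12.0, 2.0                                   # k is an illustrative value
d0 = np.array([-4.0, 2.0, -1.0, 5.0])                 # made-up dither d(0)
d1 = np.where(d0 < 0, d0 + step / 2, d0 - step / 2)   # Eq. (5.2)

x = np.array([100.3, -20.7, 55.0, 7.9])               # made-up host coefficients
z = quantize(x + k * d1, step) - d1                   # bit-1 branch of Eq. (5.3)

decoded = decode_bit(z, d0, d1, step)
restored = remove_self_noise(z, decoded, d0, d1, k)
```

Because `z + d1` lies exactly on the quantizer grid, $S_B$ is zero while $S_A$ is not, so the embedded bit 1 is recovered.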

### 5.3 Proposed VLSI Architecture of Access Control Scheme [171]:

In this section, we explain the proposed VLSI data path of the lifting based DWT quality access control algorithm described in Section 5.2. The increasing demand for low-power portable communication systems requires new low power techniques, especially for FPGAs.

Although an FPGA based design dissipates more power than fixed logic, it is cheap and reconfigurable, so it remains a good option for the designer. The proposed hardware design follows both the international technology roadmap for semiconductors (ITRS) and the power-aware hardware description language (HDL) [182, 208, 209] technique to achieve low power consumption.

The data paths of the proposed access control encoder and decoder are designed using very high speed integrated circuit hardware description language (VHDL) and implemented on a Xilinx Zynq (XC7Z020-CLG484-1) FPGA. The different modules of the architecture are configured separately, and the encoder and decoder architectures are constructed symmetrically. The encoder module contains the units 'Image\_RAM', lifting based DWT/IDWT i.e. 'Image DWT/ Image IDWT', 'Dither Generation and Watermark Permutation', 'Embed\_block', and a finite-state machine (FSM) based 'Control Unit'. The decoder module likewise contains 'Image\_RAM', lifting based DWT/IDWT i.e. 'Image DWT/ Image IDWT', 'Dither Generation', 'Embed\_block', and an FSM based 'Control Unit'.

Each hardware component is designed, tested, and optimized independently before being integrated to perform the encoding and decoding operations. In this architecture, each component of the system is enabled only when it is needed; otherwise, it remains disabled [184].

#### 5.3.1 Watermark Encoder Datapath:

To test the implemented prototype, the encoding algorithm is applied to different host images of size (512×512). Figure 5.3 shows the data path of the watermark encoder. The (N×N) image pixels are preloaded in 'Image\_RAM'. An external input bus can also be added as an interface to 'Image\_RAM' for real-time implementation. The watermark and the random key are stored in separate read-only memories (ROMs), which serve as inputs to the encoder. The lifting based 2-D DWT is performed on each (512×512) sized image. The resulting DWT coefficients are then stored back into 'Image\_RAM'. Selected coefficients in 'Image\_RAM' are then modified by the adaptive dither modulation technique described in Algorithm 5.1. The watermark embedding hardware is discussed in detail below.

#### 5.3.1.1 Image RAM:

A dual port 'Image\_RAM' is designed to store image pixels, intermediate DWT coefficients, and IDWT pixels. The 'Image\_RAM' provides (512×512) pixel locations with an 8-bit depth per pixel. Figure 5.4(a) and (b) show the top-level register-transfer level (RTL) view and the input-output pin details of 'Image\_RAM'.

With a dual port random-access memory (RAM), reading and writing can be done simultaneously at different memory cells. The read and write addresses are carried on two separate address buses to increase the speed of read and write operations. Reading and writing are controlled by a separate 'wr\_en' line.
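The behaviour of such a memory can be modelled in software as follows. This is a behavioural sketch only, not the VHDL design: the port names `rd_addr` and `wr_addr` are hypothetical, the write-enable and data names are modelled on the signals mentioned in this section, and read-before-write ordering within a cycle is an assumption.

```python
class DualPortRam:
    """Behavioural model of a simple dual-port RAM (one read, one write port)."""

    def __init__(self, depth, width_bits=8):
        self.mask = (1 << width_bits) - 1   # keep stored values within the data width
        self.cells = [0] * depth

    def cycle(self, rd_addr, wr_addr, pixel_in=0, wr_en=False):
        """Model one clock cycle: read one cell and optionally write another."""
        pixel = self.cells[rd_addr]         # read port (assumed read-before-write)
        if wr_en:
            self.cells[wr_addr] = pixel_in & self.mask
        return pixel


ram = DualPortRam(depth=512 * 512)
first = ram.cycle(rd_addr=0, wr_addr=5, pixel_in=200, wr_en=True)   # write cell 5
second = ram.cycle(rd_addr=5, wr_addr=6, pixel_in=17, wr_en=True)   # read 5, write 6
third = ram.cycle(rd_addr=6, wr_addr=0)                             # read only
```

Each call reads one address and writes a different one in the same cycle, which is the property that lets the DWT stage fetch pixels while storing coefficients.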

Data is stored and retrieved through the 8-bit input and output data buses, i.e. 'pixel\_in' and 'pixel', respectively. In the prototype implementation of the encoder and decoder, the image pixels are supplied to 'Image\_RAM' from a text file.

For real-time implementation of the scheme, an image sensor module can also be added to this prototype circuit.

The image sensor interface is not considered in the prototype implementation, to keep the scheme simple and easy to implement.



Figure 5.3: Data-path for watermark encoder





Figure 5.4: (a) RTL view of Image\_RAM; (b) Input-output pin details of Image\_RAM

#### 5.3.1.2 DWT/IDWT Block:

The 2-D DWT/IDWT is the major part of the encoder and decoder modules. An efficient implementation of lifting based DWT decreases the implementation complexity by reducing the number of arithmetic operations and memory accesses [209].

This makes the lifting based DWT suitable for applications that require high throughput and low power consumption. The hardware implementation of the lifting based DWT algorithm is fast, and the IDWT is equally simple.

The 1-D DWT is calculated by transforming the rows of the  $(512 \times 512)$  image matrix, based on Algorithm 2.1 described in Chapter 2.

The hardware 2-D DWT and IDWT modules follow [209]. They are realized from a behavioral model described in VHDL and synthesized for the FPGA.

The hardware architecture of the 2-D DWT/IDWT is optimized in terms of resource utilization, speed, and power consumption [209]. The RTL view of the 2-D DWT is depicted in Figure 5.5(a), and the port details are given in Figure 5.5(b).

The IDWT is the reverse process of the DWT. The port configurations of the DWT and IDWT are identical.

The 9/7 filter based DWT module computes the DWT of the  $(512 \times 512)$  image at a maximum frequency of 174.630 MHz.
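For readers unfamiliar with lifting, a one-level 1-D 9/7 transform can be sketched as two predict and two update steps followed by scaling. The code below is a floating-point illustration and not Algorithm 2.1 or the hardware of [209]: it uses the commonly quoted CDF 9/7 lifting constants, one of several scaling conventions, and periodic boundary extension for brevity. Each step is undone exactly by the inverse, which is why the IDWT is as simple as the DWT.

```python
import numpy as np

# Commonly quoted CDF 9/7 lifting constants (scaling convention is an assumption).
ALPHA, BETA = -1.586134342, -0.05298011854
GAMMA, DELTA = 0.8829110762, 0.4435068522
K = 1.149604398


def lifting_97_forward(x):
    """One-level 1-D transform of an even-length signal: (approximation, detail)."""
    x = np.asarray(x, dtype=float)
    s, d = x[0::2].copy(), x[1::2].copy()   # split into even and odd samples
    d += ALPHA * (s + np.roll(s, -1))       # predict 1
    s += BETA * (d + np.roll(d, 1))         # update 1
    d += GAMMA * (s + np.roll(s, -1))       # predict 2
    s += DELTA * (d + np.roll(d, 1))        # update 2
    return s * K, d / K                     # scaling


def lifting_97_inverse(s, d):
    """Undo the forward steps in reverse order with opposite signs."""
    s = np.asarray(s, dtype=float) / K
    d = np.asarray(d, dtype=float) * K
    s -= DELTA * (d + np.roll(d, 1))
    d -= GAMMA * (s + np.roll(s, -1))
    s -= BETA * (d + np.roll(d, 1))
    d -= ALPHA * (s + np.roll(s, -1))
    x = np.empty(s.size + d.size)
    x[0::2], x[1::2] = s, d                 # merge even and odd samples
    return x


row = np.arange(16.0) ** 2                  # made-up image row
approx, detail = lifting_97_forward(row)
rebuilt = lifting_97_inverse(approx, detail)
print(np.allclose(rebuilt, row))
```

A 2-D level is obtained by applying the same 1-D transform to every row and then to every column, as the control unit does in states 'ST1' and 'ST2'.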



Figure 5.5: (a) RTL view of 2-D DWT; (b) Input-output pin details of 2-D DWT

#### **5.3.1.3 Dither Generation and Watermark Permutation Block:**

The required dither sequences are generated using Eqs. (5.1) and (5.2). Meanwhile, the permuted watermark sequence is generated from the watermark pre-stored in 'watermark\_ROM'. Embedding is done inside the 'Dither Generation and Watermark Permutation' block. In the RTL representation this block is named 'calculation\_block', as shown in Figure 5.6(a). The input-output pin description of the calculation\_block is given in Figure 5.6(b).

The permuted watermark is generated from the watermark stored in read-only memory (ROM). The calculated dither and the permuted watermark (W') are then transferred to the 'embed\_block' to insert the watermark bit. The data embedding is done by modulating the DWT coefficients.



Figure 5.6: Dither Generation and Watermark Permutation Block: (a) RTL view of calculation\_block; (b) Input-output pin details of calculation\_block. Note: 'Dither Generation and Watermark Permutation Block' is described as calculation\_block

#### 5.3.1.4 Embed Block:

The 'Embed\_block' is a part of the calculation block. It modulates the selected coefficients as described in watermark embedding algorithm 5.3.

The modified watermarked coefficients are then transferred back to 'Image\_RAM'.

The RTL view and the pin details of the embed block are shown in Figure 5.7(a) and (b), respectively.







### **5.3.1.5 Encoder Control Unit Block:**

The 'Control Unit' of the embedding process is designed as a finite state machine (FSM) that provides synchronization and timing signals to each individual block.

This ensures that the whole embedding process runs in a predetermined sequence.

Figure 5.8(a) shows the 14 distinct states that embed the watermark bits in the selected DWT coefficients and store the values back into 'Image\_RAM'.

Figure 5.8(b) describes the various control signals that are generated by the control unit block.

The process starts with a 'start = high' trigger pulse. The 'ST1' and 'ST2' states jointly calculate the first-level DWT of the ( $512 \times 512$ ) image using the active-high 'Pixel\_valid\_512' signal.

State 'ST1' performs the row-wise 1-D DWT, whereas 'ST2' performs the column-wise computation that completes the first-level 2-D DWT.

The second and third levels of DWT decomposition are computed in states 'ST3', 'ST4' and 'ST5', 'ST6' with the active-high 'Pixel\_valid\_256' and 'Pixel\_valid\_128' signals, respectively.

Following the proposed algorithm, state 'ST7' generates the dither sequences and then modulates the DWT coefficients by dither modulation. States 'ST8' to 'ST13' jointly compute the IDWT.



Figure 5.8: (a) Encoder control FSM; (b) Control signal map

After the IDWT computation is complete, the program control returns to the idle state. The dual port 'Image\_RAM' is accessed in every state for reading and writing operations.
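The state sequence described above can be sketched in software as a simple table of states with a next-state function. This is only an illustrative model: the one-step-per-state timing, the function names, and the treatment of 'ST0' as the idle state are assumptions, and in hardware each state would last for as many clock cycles as its stage needs.

```python
from enum import IntEnum


class EncoderState(IntEnum):
    """Fourteen states of the encoder control FSM (ST0 is taken as idle)."""
    ST0 = 0    # idle, waits for the start pulse
    ST1 = 1    # level-1 row-wise 1-D DWT (Pixel_valid_512)
    ST2 = 2    # level-1 column-wise 1-D DWT (Pixel_valid_512)
    ST3 = 3    # level-2 DWT (Pixel_valid_256)
    ST4 = 4    # level-2 DWT (Pixel_valid_256)
    ST5 = 5    # level-3 DWT (Pixel_valid_128)
    ST6 = 6    # level-3 DWT (Pixel_valid_128)
    ST7 = 7    # dither generation and dither modulation
    ST8 = 8    # IDWT stages ST8..ST13
    ST9 = 9
    ST10 = 10
    ST11 = 11
    ST12 = 12
    ST13 = 13


def next_state(state: EncoderState, start: bool) -> EncoderState:
    """Stay idle until start is high, then walk ST1..ST13 and return to idle."""
    if state is EncoderState.ST0:
        return EncoderState.ST1 if start else EncoderState.ST0
    if state is EncoderState.ST13:
        return EncoderState.ST0
    return EncoderState(state + 1)


def run_once() -> list:
    """Trace one complete embedding pass, starting and ending in idle."""
    trace = [EncoderState.ST0]
    state = next_state(EncoderState.ST0, start=True)
    while state is not EncoderState.ST0:
        trace.append(state)
        state = next_state(state, start=False)
    trace.append(state)
    return trace
```

Calling `run_once()` returns the idle state, the thirteen working states in order, and the idle state again.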

#### **5.3.2 Watermark Decoder Datapath:**

Watermark decoding is the reverse of the embedding process and is described in Algorithm 5.4.

First, the watermarked image is decomposed by a 3-level DWT. Depending on the step size, the corresponding dither sequences 'D0' and 'D1' are generated on-chip.

The watermark extraction block (W\_Extraction) calculates the distance of each coefficient from the two dither quantizers and decides the watermark bit by the minimum distance.

The extracted watermark is then transferred to 'W\_ram'. The IDWT is then performed on the coefficients to recover the original image pixels.

The data path of the watermark extraction block is symmetrical to the embedding data path and is shown in Figure 5.9. The top-level RTL view, along with the pin details of the 'W\_Extraction' unit, is shown in Figure 5.10.

| Algorithm 5.4: Extraction algorithm                        |
|------------------------------------------------------------|
| Start                                                      |
| If $row(8:0)$ & $col(8:0)$ = address of modified pix then  |
| $S_A$ = [image\_ram] - [(Y [[image\_ram] + (D0)]/Y) - D0]; |
| $S_B$ = [image\_ram] - [(Y [[image\_ram] + (D1)]/Y) - D1]; |
| If $S_A \ge S_B$ then                                      |
| Watermark = D1;                                            |
| Else                                                       |
| Watermark = D0;                                            |
| End if;                                                    |
| End if;                                                    |
| End                                                        |




Figure 5.9: The internal data-path for watermark decoder




Figure 5.10: (a) RTL view of W\_Extraction; (b) Input-output pin details of W\_Extraction


#### 5.3.2.1 Watermark Decoding Control Unit Block:

The control unit of the decoder provides proper timing and controls each individual circuit component of the decoder module. The FSM (shown in Figure 5.11) performs the decoding process in fourteen distinct states.



Figure 5.11: Decoder control FSM

The decoding process starts with a 'start = high' trigger pulse. The decoder FSM performs the same operations as the encoder control unit, except in state 'ST7'. State 'ST7' removes the watermark bit by computing $S_A$ and $S_B$. Based on the extracted watermark bit (0 or 1), the self-noise is suppressed from the DWT coefficient.

The DWT coefficients are then stored back into the corresponding locations of 'Image\_RAM'. The extracted watermark is reverse permuted and stored in 'w\_ram'. The IDWT is then computed by states 'ST8' to 'ST13'.

The decoded pixels can be read out from 'Image\_RAM' after the IDWT is complete. When state 'ST13' completes, the program control returns to the idle state, i.e. ST0. The dual port 'Image\_RAM' is updated in every state using the proper read and write control signals.

#### **5.4 Performance Evaluation:**

The performance of the scheme is evaluated through experiments conducted on an Intel(R) Core(TM) i3-4005U CPU (1.7 GHz) with 8 GB RAM and a 64-bit OS, using Xilinx tools (ISE Design Suite 14.5 and Vivado 2014.2 design suite). To analyze the performance of the scheme, a prototype hardware design is implemented on an FPGA.

This part briefly discusses the results of the proposed prototype hardware design in terms of throughput, power consumption, and component requirements. Results are obtained for the XILINX Zynq (XC7Z020-CLG484-1) series FPGA target board. The timing simulations are done using the XILINX ISE Simulator.

Moreover, the design is characterized in the Vivado 2014.2 design suite by imposing different synthesis constraints. The VLSI implementation of the watermark encoder and decoder is tested for 8-bit grayscale images of different sizes.

This design may be extended to larger images (N×N) for real-time applications, using fully parallel processing or a combination of parallel and pipelined architectures. The cost of the architecture design in the FPGA is evaluated in terms of basic entities such as slice flip-flops, LUTs, multiplexers (MUX), and BRAMs.

The logic utilization and performance summary of the different units are presented in Table 5.1 for images of size  $(256\times256)$  and  $(512\times512)$. The clock cycle requirements of the watermark embedding and decoding processes are also reported for each image size.

However, the encoding and decoding process is done separately. The efficiency of encoder and decoder in terms of data rate is calculated by Eq. (2.17) discussed in Chapter 2.

The encoder provides an embedding rate of 23.819 Megabyte/s and the decoder a decoding rate of 23.835 Megabyte/s for a  $(512\times512)$  image. The overall power consumption of the proposed system is calculated from the static and dynamic power. The static power depends on the resources used in the design [206].

The dynamic power consumed by a VLSI circuit in an FPGA depends on the effective capacitance of the resources, the resource utilization, and the switching activity of the resources [182, 206]. The dynamic power of a circuit is expressed in terms of the effective capacitance ($C_j$), input clock frequency ($f_j$), and voltage swing ($V_j$) of the $j^{th}$ resource.

The dynamic power is given by Eq. (2.15), as discussed in Chapter 2. The power consumption is estimated using the XILINX XPower Analyzer (XPA), an interactive graphical tool for analyzing the power consumption of XILINX FPGA devices.
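As a rough illustration of how the three quantities combine, the sketch below sums the per-resource contributions. Eq. (2.15) is not reproduced in this chapter, so the commonly used CMOS relation $P = \sum_j C_j V_j^2 f_j$ is assumed here; the function name and the numerical values are purely illustrative and are not measurements of the proposed design.

```python
def dynamic_power(resources):
    """Sum C_j * V_j**2 * f_j over resources given as (C_j, f_j, V_j) tuples.

    Units: farads, hertz, volts -> watts. The formula is the standard CMOS
    switching-power relation, assumed here in place of Eq. (2.15).
    """
    return sum(c * v * v * f for c, f, v in resources)


# Illustrative values only: two resources switching at 100 MHz with a 1 V swing.
example_resources = [(10e-12, 100e6, 1.0), (5e-12, 100e6, 1.0)]
total_watts = dynamic_power(example_resources)
```

The sketch makes the dependence explicit: power grows linearly with capacitance and clock frequency and quadratically with voltage swing.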

The total estimated power is 78.52 mW for the encoding system and 78.45 mW for the decoding system. According to this study, the proposed hardware architecture of the watermark encoder and decoder blocks requires very low power.


### Table 5.1: HDL synthesis summary of encoder and decoder: resource utilization, power consumption, timing report, and data rate.

| Sr. No. | Parameter | Encoder | Decoder | Encoder | Decoder |
|---------|-----------|---------|---------|---------|---------|
| 1 | Target board details | Zynq (XC7Z020-CLG484-1) | Zynq (XC7Z020-CLG484-1) | Zynq (XC7Z020-CLG484-1) | Zynq (XC7Z020-CLG484-1) |
| 2 | Image size | (256×256) | (256×256) | (512×512) | (512×512) |
| 3 | Number of slice LUTs | 437 | 429 | 476 | 456 |
| 4 | Number of slice registers | 208 | 214 | 230 | 227 |
| 5 | Number of fully used LUTs | 110 | 126 | 157 | 162 |
| 6 | Bonded IOBs | 12 | 12 | 12 | 12 |
| 7 | Number of BUFG | 1 | 1 | 1 | 1 |
| 8 | Number of block RAM | 16 | 16 | 64 | 64 |
| 9 | Number of DSP48E1s | 2 | 2 | 2 | 2 |
| 10 | Maximum frequency (MHz) | 120.013 | 120.112 | 130.094 | 130.191 |
| 11 | Power (static + dynamic) | 69.93 mW | 69.55 mW | 78.52 mW | 78.45 mW |
| 12 | Hardware execution time (N<sub>C</sub>) | 0.0037792 s | 0.0037767 s | 0.0110054 s | 0.0109979 s |
| 13 | Total clock cycles required | 453688 | 453642 | 1431883 | 1431832 |
| 14 | Data rate | 17.3411 Megabyte/s | 17.3522 Megabyte/s | 23.819 Megabyte/s | 23.835 Megabyte/s |
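The data rate row of Table 5.1 can be cross-checked from the image size and the hardware execution time. Eq. (2.17) is not reproduced in this chapter, so the sketch below assumes the data rate is the number of 8-bit pixels processed divided by the execution time, expressed in units of 10^6 bytes per second; the function name is illustrative.

```python
def data_rate_mbyte_per_s(width, height, exec_time_s, bytes_per_pixel=1):
    """Pixel bytes processed per second, in units of 10**6 bytes (assumed definition)."""
    return width * height * bytes_per_pixel / exec_time_s / 1e6


# Execution times taken from Table 5.1.
enc_256 = data_rate_mbyte_per_s(256, 256, 0.0037792)
dec_256 = data_rate_mbyte_per_s(256, 256, 0.0037767)
enc_512 = data_rate_mbyte_per_s(512, 512, 0.0110054)
dec_512 = data_rate_mbyte_per_s(512, 512, 0.0109979)
```

Under this assumption the computed values agree with the tabulated data rates to within rounding of the execution times, which supports reading the rates as Megabyte/s.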

### Table 5.2: Throughput comparison table along with data embedding rate.

| Scheme               | Design Type | Throughput        |
|----------------------|-------------|-------------------|
| Maity and Kundu [71] | FPGA        | 4.706 Megabits/s  |
| Maity and Maity [76] | FPGA        | 1.0395 Megabits/s |
| Das et al. [81]      | FPGA        | 35.284 Megabits/s |
| Karri et al. [57]    | FPGA        | 137 Megabits/s    |
| Proposed Scheme      | FPGA        | 23.827 Megabyte/s |

The throughput of the proposed method is compared with that of related methods in Table 5.2. The results in Table 5.2 show that the scheme offers a competitive throughput of 23.827 Megabyte/s compared to the other FPGA based hardware implementations. Table 5.8 shows the comparative study of resource utilization and power consumption with similar works found in the current literature [76, 79, 81]. The resource utilization of the proposed scheme is very low owing to the designed architecture.

The proposed lifting DWT based access control hardware consumes 89.536% and 22.29% less power than the hardware-based implementations in [79] and [76], respectively, and requires only 78.48 mW of power at an operating frequency of 130.14 MHz.

It is also observed that the optimized design uses far fewer resources than a similar DWT based implementation [79]. The power improvement is achieved through the low power implementation strategy described in [184, 206]. The trade-off among area, power, and speed of the design (in terms of frequency) is investigated in the Vivado 2014.2 design suite. In FPGA based design, the area is represented by the number of slices or the equivalent gate count.

Figure 5.12 depicts the trade-off between area, power, and frequency graphically. Both the power consumption and the area (in terms of resource utilization) increase with frequency. A compromise is therefore needed between the desired operating frequency and the area and power consumption required to achieve it.



Figure 5.12: Frequency vs. Power and Area (in terms of resource utilization) i.e. Area/Resource vs. Frequency trade-off curve

The simulated waveforms for a selected portion of the proposed quality access control system are shown in Figures 5.13(a), 5.13(b), and 5.13(c). Because the encoding and decoding systems are symmetrical, only the RTL view of the encoder is presented. Figure 5.13(a) shows the different dither sequences along with the permuted watermark bits. The 'Input Pixel' line shows the grayscale image pixels being transformed by the DWT system, which is represented by state 'S1'.

Figure 5.13(b) shows the encoded pixels and Figure 5.13(c) the decoded pixels. The proposed design consumes very low power and offers high throughput with low FPGA resource utilization, which makes it attractive for low power reconfigurable access control applications compared to other hardware designs.

[Waveform panels (a), (b), (c). Signals shown: clk, rst, start, pixel\_in[7:0], pixel\_out[7:0], wr\_en, state (dwt\_row\_512, idwt\_row\_512), w[0:23], d0[0:23], d1[0:23]]

Figure 5.13: HDL simulation results: (a) Different dither sequences along with the permuted watermark bits; (b) Encoded pixel; (c) Decoded pixel

## 5.5 Chapter Summary:

A field programmable gate array (FPGA) based hardware implementation of a data hiding scheme for efficient quality access control of images is proposed in this chapter.

The hardware implementation of the encoder module for a ( $512 \times 512$ ) grayscale image on the Zynq FPGA requires only 476 slice LUTs and 157 fully used LUTs, with a power consumption of 78.52 mW. The watermark decoding process uses only 456 slice LUTs and 162 fully used LUTs and consumes only 78.45 mW. Moreover, the watermark embedding and extraction processes require only 1,431,883 and 1,431,832 clock cycles, respectively, at a maximum frequency of 130.16 MHz for a ( $512 \times 512$ ) image, using the Xilinx ISE 14.5 simulator.

The schemes reported in Chapter 3, Chapter 4, and Chapter 5 share one limitation: they all operate in the transform domain. Such schemes are computationally complex and, from a hardware point of view, costly in terms of implementation time, resource utilization, and speed of operation. The next chapter presents a spatial domain access control scheme that is simple and easy to implement in an FPGA.

FPGA Based Reconfigurable Hardware Architecture for Quality Access Control of Images and Coupling in Fiber Optics Communication

https://www.kdpublications.in

ISBN: 978-93-90847-75-4

# Chapter 6

# A Novel QIM Data Hiding Scheme and its Hardware Implementation using FPGA for Quality Access Control of Digital Image

### 6.1 Introduction:

The rapid progress of technology allows digital media such as images, text, audio, and video to be accessed, duplicated, and stored without any loss of quality. Content owners wish to place large amounts of valuable work on websites for publicity, but for business reasons they want to restrict users from accessing the full quality. A scheme is needed to serve this purpose. Such a scheme allows all receivers to display an image (or information) at a low quality that has little commercial value, and at the same time delivers a higher quality image (or information) according to the user's access rights, which are defined by the subscription contract. The need for access control has attracted considerable attention from researchers in recent years. Many data hiding methods have been proposed for copyright protection. Ownership verification and authentication are also applied in access control to improve security. The access control literature is discussed in Section 1.2.4 of Chapter 1.

Usually, access control is performed in the transform domain by modulating the coefficients of the DCT, the DWT, or their variants using a secret key. Transform domain methods suffer from heavy computation, which affects the hardware implementation. To achieve better hardware performance, a simple but efficient data hiding scheme is essential. A study of previous work reveals that most of the FPGA based hardware designs in the existing literature target small images, which is not well suited to a quality access control scheme. Furthermore, the designs found in the literature leave considerable room for improvement in terms of resource utilization, power consumption, and efficiency.

In this chapter, we present a QIM data hiding method for quality access control of digital images of size (512×512), along with its FPGA based optimized hardware design. The performance of the hardware model is also presented. The prototype design is implemented on an FPGA to exploit the advantage of low investment cost.

The main characteristics of the system design are:

- **A.** *Increased System Performance in Terms of Normalized Cross-Correlation (NCC):* The decision variable for each bit of watermark decoding is formed from the weighted average of N decision statistics. The simulation results show that embedding each watermark bit over N mutually orthogonal signal points, and detecting it from the weighted average of N decision statistics, increases the system performance in terms of NCC, which leads to better access control through the reversible process.
- **B.** *Larger Size Image:* Hardware can support up to  $(512 \times 512)$ sized image.
- **C.** *Minimum Utilization of FPGA Resources:* The image encoder and decoder modules use fewer field programmable gate array (FPGA) resources, such as occupied slices, slice registers, slice LUTs, BRAMs, and DSP blocks, than other hardware implementations. The system level implementation (encoder and decoder) for a (512×512) image requires only 237 and 389 slice LUTs (lookup tables) and 182 and 245 slice registers, respectively.
- **D.** *High Throughput:* The prototype hardware achieves an average throughput of 119.048 Megabyte/s at an operating clock speed of 120 MHz.
- **E.** *Extremely Low Power Consumption:* With the optimized design, the encoder and decoder architectures require only 86.21 mW and 95.88 mW of power, respectively.

The rest of the chapter is organized as follows: Section 6.2 describes the proposed access control scheme, and Section 6.3 describes the VLSI architecture of the proposed QIM algorithm. Section 6.4 presents the performance evaluation of the proposed scheme, and finally the chapter summary is given in Section 6.5.

#### 6.2 Proposed Access Control Scheme [64]:

In this section, we present the encoding and the decoding schemes of the proposed method.

#### A. Watermark Encoding:

The watermark encoding process of the proposed scheme is described in Figure 6.1. The stepwise encoding process is described below.



Figure 6.1: Block diagram of the watermarking process: encoder

#### **Step 1: Watermark Permutation process:**

Let us assume that the binary watermark is  $(W) = \{b_1, b_2, \dots, b_N\}$, where  $b_i \in \{0,1\}$  and  $i = 1,2,3 \dots N$. The watermark is permuted (XORed) with a random key (k). The permutation process increases the security of the watermark.
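The XOR based permutation can be sketched as follows. The key generation from a seed and both function names are assumptions made for illustration; the source only states that the watermark is XORed with a random key.

```python
import random


def make_key_bits(seed, n):
    """Generate n pseudo-random key bits from a seed (illustrative key source)."""
    rng = random.Random(seed)
    return [rng.getrandbits(1) for _ in range(n)]


def xor_permute(bits, key_bits):
    """XOR each watermark bit with the key bit at the same position."""
    if len(bits) != len(key_bits):
        raise ValueError("watermark and key must have the same length")
    return [b ^ k for b, k in zip(bits, key_bits)]


watermark = [0, 1, 1, 0, 1, 0, 0, 1]
key = make_key_bits(seed=42, n=len(watermark))
permuted = xor_permute(watermark, key)
recovered = xor_permute(permuted, key)  # XOR with the same key undoes the step
```

Because XOR is its own inverse, the decoder can reuse the same routine with the same key to recover the watermark.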


#### Step 2: Selection of Signal Coefficients for one Bit of Watermark Embedding:

The host image is projected onto N mutually orthogonal signal points in an N-dimensional signal space with the help of the Gram-Schmidt orthogonalization process or by downsampling. The host signal  $(X) = \{X_1, X_2, \dots, X_N\}$  consists of the signal coefficients corresponding to the complete orthogonal basis function set.

Theoretically, a large value of N is desirable, because selecting mutually uncorrelated points removes the strong inherent correlation among the sample pixels [64].

Selecting mutually orthogonal signal points increases the robustness of the proposed embedding scheme.

#### **Step 3: Watermark Insertion:**

a) Generation of Binary Dither for QIM: Two dither sequences of length n are generated pseudo-randomly using a 'key' and the step size ( $\Delta$ ) as follows:

$$d_q(0) = \{\Re(key) \times \Delta\} - \Delta/2, \qquad 0 \le q \le n - 1 \tag{6.1}$$

$$d_q(1) = \begin{cases} d_q(0) + \Delta/2 & \text{if } d_q(0) < 0\\ d_q(0) - \Delta/2 & \text{if } d_q(0) \ge 0 \end{cases} \tag{6.2}$$

Here  $\Re(key)$  is a random number generator. Dithers d(0) and d(1) are used for embedding watermark bits '0' and '1', respectively.
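A minimal sketch of Eq. (6.1) and (6.2) is given below. It assumes that $\Re(key)$ returns values uniformly distributed in [0, 1) and uses Python's seeded generator as a stand-in; the function name and the seed handling are illustrative only.

```python
import random


def generate_dithers(key, n, delta):
    """Return the two dither sequences d(0) and d(1) of length n.

    Eq. (6.1): d_q(0) = R(key) * delta - delta / 2, with R assumed uniform in [0, 1).
    Eq. (6.2): d_q(1) is d_q(0) shifted by delta / 2 towards zero.
    """
    rng = random.Random(key)  # the same key reproduces the same sequences
    d0 = [rng.random() * delta - delta / 2 for _ in range(n)]
    d1 = [d + delta / 2 if d < 0 else d - delta / 2 for d in d0]
    return d0, d1


d0, d1 = generate_dithers(key=7, n=24, delta=8.0)
```

Under this assumption every element of d(0) lies in [-Δ/2, Δ/2), and the corresponding element of d(1) differs from it by exactly Δ/2. Since the sequences depend only on the key and the step size, the decoder can regenerate them.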

b) Watermark Bit Insertion: In this watermarking method, the orthogonal host signal points are quantized with a step size ( $\Delta$ ) using a quantizer $Q_{\Delta}(\cdot)$, based on the message bit (*m*). The watermarked signal  $(X'_N)$  is considered as an *N*-dimensional vector  $\{X'_1, X'_2, X'_3, \dots, X'_N\}$  and is represented as:

$$X'_{N} = Q_{\Delta} (X_{N} + k \times d(m)) - d(m); \quad m \in \{0, 1\} \tag{6.3}$$

where d(.) represents the dither sequence used for embedding the watermark bit, and k denotes the quality degradation factor. In this design, the degradation factor is k = 2. Numerous independent watermarking experiments show that this degradation factor is sufficient for access control of images.
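Eq. (6.3) can be sketched for a single signal point as follows. The quantizer is assumed to be a uniform quantizer that rounds to the nearest multiple of Δ; the function names and the example values are illustrative and not taken from the hardware design.

```python
import math


def quantize(x, delta):
    """Uniform quantizer Q_delta: nearest multiple of delta (halves round up)."""
    return delta * math.floor(x / delta + 0.5)


def qim_embed(x, bit, d0, d1, delta, k=2):
    """Eq. (6.3): X' = Q_delta(X + k * d(m)) - d(m) for one signal point."""
    d = d1 if bit else d0  # pick the dither that matches the message bit
    return quantize(x + k * d, delta) - d


# Example with delta = 8 and a dither pair that satisfies Eq. (6.2).
marked_0 = qim_embed(100.0, 0, d0=1.0, d1=-3.0, delta=8.0)
marked_1 = qim_embed(100.0, 1, d0=1.0, d1=-3.0, delta=8.0)
```

For the host value 100 the sketch yields 103 when bit 0 is embedded and 99 when bit 1 is embedded, so the two bits map to distinct quantization lattices.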

#### **B.** Watermark Decoding:

Decoding is the reverse process and is described stepwise below. Figure 6.2 shows the watermark decoding scheme.

**Step 1:** The first step generates the binary dither, using the same step size ( $\Delta$ ) and key as used in the encoder.



Figure 6.2: Block diagram of the watermarking process: decoder

**Step 2:** Watermark Bit Extraction: The decoder receives the watermarked signal and projects the image onto *N* orthogonal signal points. These points are then requantized using the  $m^{th}$  dither, resulting in  $r^m = \{r_1^m, r_2^m, \dots, r_N^m\}$. The decision variable  $(r_n^m)$  for the  $m^{th}$  dither at the  $n^{th}$  signal point is given by:

$$r_n^m = \left| X_N' - \bigl[ Q_\Delta (X_N' + d(m)) - d(m) \bigr] \right|; \quad m \in \{0, 1\} \tag{6.4}$$

The minimum mean square error combining (MMSEC) strategy is used to determine the decision variable  $(D^m)$  for the  $m^{th}$  watermark bit, which is given by:

$$D^m = \sum_{n=1}^N r_n^m w_{nm} \tag{6.5}$$

where the weight factor  $w_{nm}$  is defined as

$$w_{nm} = 1 - e^{(a|r^m|+b)} \tag{6.6}$$

The parameter values a = 0.5 and b = 2.6 were determined from a large number of independent trials on a large number of benchmark images. The decision variable  $(D^m)$  is fed into the decision device (minimum distance decoder), whose output determines the binary bit pattern. The correlator outputs thus form a decision vector  $D = [D^1, D^2, ..., D^K]$, which is used to obtain the embedded bits  $b = \{\hat{b}_1, \hat{b}_2, ..., \hat{b}_N\}$.
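The minimum distance decision can be sketched as below. Several simplifying assumptions are made: the distance is taken as $|X' - (Q_\Delta(X' + d(m)) - d(m))|$, the grouping used in Algorithm 5.4; the quantizer rounds to the nearest multiple of Δ; and the weights $w_{nm}$ are all set to 1, so the decision variable is a plain sum rather than the weighted MMSEC combination. Function names and test values are illustrative.

```python
import math


def quantize(x, delta):
    """Uniform quantizer Q_delta: nearest multiple of delta (halves round up)."""
    return delta * math.floor(x / delta + 0.5)


def distance(x_marked, d, delta):
    """Distance of X' from the quantization lattice shifted by dither d."""
    return abs(x_marked - (quantize(x_marked + d, delta) - d))


def decode_bit(points, d0, d1, delta):
    """Decide one watermark bit from N watermarked points (unit weights assumed).

    points: N watermarked values that carry the same bit.
    d0, d1: per-point dither values for bit 0 and bit 1.
    """
    s0 = sum(distance(x, d, delta) for x, d in zip(points, d0))
    s1 = sum(distance(x, d, delta) for x, d in zip(points, d1))
    return 0 if s0 < s1 else 1  # ties go to 1, as in Algorithm 5.4


# Points produced by Eq. (6.3) with k = 2, delta = 8 and the dithers below.
dither_0 = [1.0, 2.0, -1.0]
dither_1 = [-3.0, -2.0, 3.0]
bit_from_zero = decode_bit([103.0, 54.0, 201.0], dither_0, dither_1, 8.0)
bit_from_one = decode_bit([99.0, 50.0, 205.0], dither_0, dither_1, 8.0)
```

In the noise-free case the distance to the correct lattice is zero and the distance to the other lattice is Δ/2 per point, so summing over N points widens the decision margin.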

**Step 3:** Noise Cancellation for Access Control: The self-noise is then removed using the following equation.

$$X''_N = X'_N - (k - 1) \times d(m) \tag{6.7}$$

Here  $X''_N$  is the watermarked signal after self-noise elimination. This process helps to obtain a better quality image.
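The effect of Eq. (6.7) can be illustrated with a short sketch. It assumes that the factor in Eq. (6.7) is the degradation factor k = 2 of Eq. (6.3) and that the quantizer rounds to the nearest multiple of Δ; the function names are illustrative.

```python
import math


def quantize(x, delta):
    """Uniform quantizer Q_delta: nearest multiple of delta (halves round up)."""
    return delta * math.floor(x / delta + 0.5)


def embed(x, d, delta, k=2):
    """Eq. (6.3) for one point, with d already chosen for the message bit."""
    return quantize(x + k * d, delta) - d


def cancel_self_noise(x_marked, d, k=2):
    """Eq. (6.7): X'' = X' - (k - 1) * d(m), using the decoded bit's dither."""
    return x_marked - (k - 1) * d


host = 100.0
marked = embed(host, d=1.0, delta=8.0)       # degraded (watermarked) value
restored = cancel_self_noise(marked, d=1.0)  # value after self-noise removal
```

Under these assumptions the restored value equals $Q_\Delta(X + k\,d(m)) - k\,d(m)$, so its deviation from the host value is only the quantization error, which is bounded by Δ/2.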

**Step 4:** Decoding of Watermark Bit: The output of the second step is the extracted watermark bits (b). These bits are reverse permuted (XORed) with the same random bits used during the encoding process to recover the decoded watermark  $(\widehat{W})$.

**Step 5:** Reliability for the Extracted Watermark after Decoding: We calculate the normalized cross-correlation (NCC) between the resized watermark image (W) and the decoded watermark  $(\widehat{W})$  to quantify the visual quality of the extracted watermark [202].
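One common definition of the normalized cross-correlation for binary watermarks is sketched below. The exact form used in [202] is not reproduced in this chapter, so the formula (the sum of element-wise products divided by the energy of the original watermark) and the function name are assumptions.

```python
def ncc(original, decoded):
    """Normalized cross-correlation: sum(W * W_hat) / sum(W * W)."""
    if len(original) != len(decoded):
        raise ValueError("watermarks must have the same length")
    energy = sum(a * a for a in original)
    if energy == 0:
        raise ValueError("original watermark must contain at least one 1")
    return sum(a * b for a, b in zip(original, decoded)) / energy


perfect = ncc([1, 0, 1, 1], [1, 0, 1, 1])
one_error = ncc([1, 0, 1, 1], [1, 0, 0, 1])
```

With this definition a perfectly decoded watermark gives an NCC of 1, and each lost '1' bit lowers the value proportionally.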

# 6.3 The VLSI Architecture of Proposed Access Control Scheme:

The VLSI implementation of the access control system is discussed in this section. The system hardware is implemented in XILINX Zynq FPGA (XC7Z020-CLG484-1).

The access control encoder and decoder data paths are designed and optimized using the Xilinx synthesis tool.

The implemented data path is described in the very high speed integrated circuit hardware description language (VHDL).

# 6.3.1 Data Format:

The internal bit representation of the data used in the encoding and decoding calculations plays a vital role in hardware design.

Generally, hardware implementations use a fixed point signed 2's complement data format, here [1, 9, 3]. The first bit, i.e. the most significant bit (MSB), is treated as the sign bit. The next nine bits, from the 2nd to the 10th bit position, represent the integer part.

The fractional part is represented by the last three bits. The range of the data format [1, x, y] is set by Eq. 3.3, as described in Chapter 3.

The selected data range of -512 to 511.875 with a precision of 0.125 is adequate, as the scheme is implemented for greyscale images, which do not need a wide dynamic range.
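The stated range and precision can be checked with a small fixed point model. The sketch assumes a 13-bit 2's complement word (1 sign bit, 9 integer bits, 3 fractional bits), which is what the range -512 to 511.875 with a step of 0.125 implies; the saturation behaviour and the helper names are illustrative choices, not details from the hardware design.

```python
FRAC_BITS = 3
TOTAL_BITS = 13                          # 1 sign + 9 integer + 3 fraction (assumed)
RAW_MIN = -(1 << (TOTAL_BITS - 1))       # -4096, i.e. -512.0
RAW_MAX = (1 << (TOTAL_BITS - 1)) - 1    # 4095, i.e. 511.875


def to_fixed(x):
    """Convert to the raw integer code: round to a 0.125 step, then saturate."""
    raw = int(round(x * (1 << FRAC_BITS)))
    return max(RAW_MIN, min(RAW_MAX, raw))


def to_float(raw):
    """Convert a raw integer code back to its real value."""
    return raw / (1 << FRAC_BITS)


lowest = to_float(RAW_MIN)
highest = to_float(RAW_MAX)
step = to_float(1)
example = to_float(to_fixed(3.3))  # nearest representable value to 3.3
```

An 8-bit pixel (0 to 255) and intermediate results such as a pixel plus twice the dither fit comfortably inside this range.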

# 6.3.2 Data Path of Access Control Encoder:

This section describes the construction of the encoder data path. The data path is designed by integrating different circuit blocks to perform the encoding operation.

The host image pixels  $(N \times N)$  are loaded into 'Image RAM'. The host image pixels are read out at the N orthogonal signal points, and the QIM technique is then used to modulate them.

The encoder module consists of the following basic building blocks: 'Dither (d) and watermark (w) generation', 'Image RAM', 'Watermark embedding', 'Random sequence generator', and 'Control unit'. The construction of the encoder data path is shown in Figure 6.3.





**Figure 6.3: Data-path for the encoder** 

### 6.3.3 Dither (d) and Watermark (w) Generator:

The dither sequences and the permuted watermark are generated in the 'Dither (d) and watermark (w) generator' block. The pre-stored watermark and the random key are stored in 'ROM 2' and 'ROM 1', respectively. Eq. (6.1) and (6.2) are used to compute the dither sequences with the help of a random key. The calculated dither sequences  $(d_0, d_1)$  and the permuted watermark (W') are then transferred to the 'Watermark (w) embedding block'. Figure 6.4 shows the RTL view and the functional details of each input-output port of the 'Dither (d) and watermark (w) generator' block.

Figure 6.4: RTL view of 'Dither (d) and watermark (w) generator' block along with the port details


# 6.3.4 Image RAM:

Image RAM is used to store the 8-bit greyscale image pixels and also the QIM modulated pixels. The system uses a dual port 'Image RAM' with a capacity of 262144 pixels to process a (512×512) image block for quality access control. In a dual port random-access memory (RAM), read and write operations can be performed simultaneously on different memory cells.

Two separate address lines, 'ad1' and 'ad2', allow the reading and writing operations to be performed simultaneously. The control input on the 'w\_r' control line manages the read-write operation.

Data are stored and read through the 8-bit data buses 'pix1' and 'pix2'. The user can interface an image sensor for a real-time implementation of the scheme. To keep the design simple, the prototype hardware implementation does not include an interface to an image acquisition system.



Figure 6.5: RTL view of 'Image RAM' along with the port details

The RTL view of the dual port 'Image RAM' is presented in Figure 6.5. The input-output port details are reported beside the RTL view of Figure 6.5.

# 6.3.5 Random Sequence Generation:

This block generates pseudo-random number (PRN) sequences. Linear feedback shift registers, together with a combinational logic based feedback mechanism, are used to generate the random binary sequence [73].

When the 'start random' signal is asserted, the 'random sequence generator' block produces 64 different random numbers at its output port.

The random number is applied as an input to the address counter to generate the random address sequences.



Figure 6.6: RTL view of 'Random sequence generator' block along with the port details

The 'Random sequence generator' block's RTL and the input-output port details are shown in Figure 6.6.
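The behaviour of such a generator can be sketched in software. The register width and the tap positions are not given in this section, so the example below assumes a 7-bit Fibonacci LFSR following the recurrence of the polynomial $x^7 + x + 1$, which has a period of 127 and can therefore supply 64 distinct non-zero numbers. The names `lfsr7` and `random_numbers` are hypothetical.

```python
def lfsr7(seed: int = 0b1000000):
    """Yield successive states of a 7-bit Fibonacci LFSR.

    Width and taps are assumptions made for illustration only.
    """
    if not 0 < seed < 128:
        raise ValueError("seed must be a non-zero 7-bit value")
    state = seed
    while True:
        yield state
        feedback = (state ^ (state >> 1)) & 1   # XOR of the two tap bits
        state = (state >> 1) | (feedback << 6)  # shift right, feed back into the MSB

def random_numbers(count: int = 64, seed: int = 0b1000000):
    """Collect the first `count` states, e.g. for use by an address counter."""
    gen = lfsr7(seed)
    return [next(gen) for _ in range(count)]

numbers = random_numbers()
print(len(numbers), len(set(numbers)))
```

Because the all-zero state locks an LFSR built from XOR feedback, the seed must be non-zero. In hardware this is usually guaranteed by the reset value of the register.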



Figure 6.7: RTL view of 'Watermark (W) embedding' block along with the port details

Figure 6.7 shows the RTL view and the port details of the 'Watermark (W) embedding' block.

### 6.3.6 Control Unit:

The 'Control unit' of the quality access control encoder is a finite state machine (FSM) with five distinct states, as shown in Figure 6.8. The inputs drive the FSM from state to state, and the FSM generates the active-high output signals *en-1*, *en-2*, and *en-3*. The encoding process starts when 'reset=0' and 'start=high'. The start random sequence is used to drive the different modules of the encoder block. The dither sequences and the permuted watermark are computed in state 'ST1'.

The random address sequences are generated in state 'ST2'. The pixel modulation is performed in state 'ST3'. The last state, 'ST4', performs the memory read operation to display the modulated host image.



Figure 6.8: FSM controlled encoding states and control signal

### 6.3.7 Data Path of Access Control Decoder:

The decoder unit performs the reverse operation to recover the original host pixel from the encoded pixel. The extraction process is given in Algorithm 6.2. First, the encoded image pixels are read out from 'RAM\_Image' and processed for watermark extraction.

| Algorithm 6.2: Extraction Algorithm                                 |
|---------------------------------------------------------------------|
| Inputs: $D_0$, $D_1$, the randomly selected pixel from RAM_Image.   |
| Start                                                               |
| If RAM_Image location = random sequence then                        |
| $N_0$ = [RAM_Image] - [(Y × [([RAM_Image] + $D_0$)/Y]) - $D_0$];    |
| $N_1$ = [RAM_Image] - [(Y × [([RAM_Image] + $D_1$)/Y]) - $D_1$];    |
| If $N_0 \geq N_1$ then                                              |
| Watermark = $D_1$;                                                  |
| Else If $N_0 < N_1$ then                                            |
| Watermark = $D_0$;                                                  |
| End if;                                                             |
| End if;                                                             |
| End                                                                 |
| Outputs: extracted watermark.                                       |
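The minimum-distance rule of Algorithm 6.2 can be illustrated with a small software model of dither-modulation QIM. This is a sketch under stated assumptions rather than the hardware data path: the quantizer uses round-half-up, the second dither is taken as the first one shifted by half a step, the distances are summed without weights, and the dither values and pixel values are made up for the example. Only the step size of 20 is taken from Section 6.4.1.

```python
import math

DELTA = 20  # step size quoted in Section 6.4.1

def quantize(x: float, d: float, delta: float = DELTA) -> float:
    """Nearest point to x on the shifted lattice delta*k - d (round half up)."""
    return delta * math.floor((x + d) / delta + 0.5) - d

def embed(pixels, bit, d0, d1, delta=DELTA):
    """Spread one watermark bit over several pixels, one dither sample each."""
    dither = d1 if bit else d0
    return [quantize(p, d, delta) for p, d in zip(pixels, dither)]

def extract(pixels, d0, d1, delta=DELTA):
    """Minimum-distance decision; a tie goes to bit 1, as in Algorithm 6.2."""
    n0 = sum(abs(p - quantize(p, d, delta)) for p, d in zip(pixels, d0))
    n1 = sum(abs(p - quantize(p, d, delta)) for p, d in zip(pixels, d1))
    return 1 if n0 >= n1 else 0

# Illustrative dither pair (assumed): d1 is d0 shifted by half a step.
d0 = [3.0, -7.0, 5.0, -2.0]
d1 = [d + DELTA / 2 if d < 0 else d - DELTA / 2 for d in d0]
host = [100, 57, 200, 143]

for bit in (0, 1):
    marked = embed(host, bit, d0, d1)
    print(bit, marked, extract(marked, d0, d1))
```

The sketch does not clip the modulated values to the 8-bit range; a hardware implementation would have to handle pixels near 0 and 255 explicitly.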

The decoder unit contains an on-chip 'Dither (d) generator' block. This block generates two dither sequences, 'D<sub>0</sub>' and 'D<sub>1</sub>'. The 'Watermark extraction' block accepts the two dither sequences and calculates the minimum distance to predict the watermark. The predicted watermark is extracted and the pixel is transferred back to 'RAM\_Image'. The N-orthogonal signal points are generated by the same random sequence generator as used in the encoder module. The control unit of the decoder block uses control signals such as *clock, reset, start, stop1, stop2*, and *stop3* to perform watermark extraction. The *en-1* signal is generated by the control unit to activate the dither generator block. The 'Random sequence generator' and 'Watermark extraction' blocks are controlled by asserting the control signals *Start Random*, *Stop Random*, and *en-2*. The watermark-extracted pixels are read out with the activation of *en-3*.

Figure 6.9: (a) Data-path for Decoder; (b) Pipelined Decoder

[RTL symbol of the 'Watermark extraction and noise cancelation' block; visible pins: w_addr(5:0), clk, en2, rst, stop2, w_en, pix1(7:0), pix2(7:0)]

| Port name                          | Port type     | Port description                                                             |
|------------------------------------|---------------|------------------------------------------------------------------------------|
| clk, rst, w_en                     | Input         | Clock, reset and enable signal to store the watermark.                       |
| en2, stop2                         | Input         | Watermark extraction block enable and calculation stop signal.               |
| D0(7:0), D1(7:0), w_addr(5:0)      | Input bus     | Dither sequences D0, D1 and the address for storing the extracted watermark. |
| pix1(7:0), pix2(7:0)               | Bidirectional | Pixel in/out bus.                                                            |

Figure 6.10: RTL view of 'Watermark extraction & noise cancelation' block along with the port details

The construction of the decoder data path is symmetrical to the embedding data path, except for the 'Watermark extraction' block and the 'Control unit', as shown in Figure 6.9(a).

The 'Dither (d) generator', 'Random sequence generator' and 'RAM\_Image' blocks are identical to those described in the encoder section. Figure 6.10 shows the RTL view of the 'Watermark extraction & noise cancelation' block.

### 6.3.8 Pipelined Architecture of Watermark Extraction & Noise Cancelation:

The watermark extraction block is also implemented with a three-stage pipeline structure to improve the system performance. Figure 6.9(b) depicts the pipelined architecture of the watermark extraction and noise cancelation.

Pipelining breaks the original decoding process into shorter stages separated by flip-flops. This increases the speed, but at the same time it increases the latency in terms of the number of clock cycles.

Moreover, pipelining decreases the dynamic power consumption in FPGA-based designs [184, 206]. Lastly, pipelining helps the synthesis tool achieve faster placement. It also reduces the number of long nets, which in turn helps to minimize the FPGA resources used. Figure 6.11 shows the RTL view of the pipelined decoder. To remove the watermark and produce the high-quality decoded pixel, the pipelined watermark extraction block takes 'D<sub>0</sub>' and 'D<sub>1</sub>' as inputs.



Figure 6.11: RTL view of pipelined decoder block along with the port details

### **6.3.9 Pipelined Decoder Control Unit:**

The pipeline decoder is controlled by a well-defined FSM. The FSM is shown in Figure 6.12.





Figure 6.12: FSM for the pipelined decoder

The FSM consists of five states. The decoding operation starts in state 'ST0'. The 'Random sequence generator' block starts generating the random sequence in state 'ST1'.

The third state, 'ST2', removes the watermark from the encoded pixels and stores the results (decoded pixel and watermark) in 'RAM\_Image' using the pipelined data path.

The pipelining process is initiated by an active-high signal on the pin *'en\_pipe'*. State 'ST3' confirms that the decoding of all pixels is complete by asserting 'stop random=1'.

Finally, state 'ST4' performs the pixel read operation from 'RAM\_Image' by asserting 'en\_pix=1'. The active-high 'stop2' signal indicates that the read operation has completed successfully, and the machine then returns to its initial state.
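The state sequence described above can be modelled in a few lines of software. The sketch below is a behavioural illustration only, not the HDL of the design. The output names 'en\_pipe', 'stop random' and 'en\_pix' and the input 'stop2' follow the text, while the transition inputs `start`, `addr_ready` and `pixels_done` are hypothetical names that stand in for conditions the text does not spell out.

```python
from enum import Enum

class State(Enum):
    ST0 = 0  # idle / start of decoding
    ST1 = 1  # random sequence generation
    ST2 = 2  # pipelined watermark removal
    ST3 = 3  # decoding of all pixels confirmed
    ST4 = 4  # pixel read-out from RAM_Image

def step(state, start=False, addr_ready=False, pixels_done=False, stop2=False):
    """Return the next state; the input names other than stop2 are assumptions."""
    if state is State.ST0:
        return State.ST1 if start else State.ST0
    if state is State.ST1:
        return State.ST2 if addr_ready else State.ST1
    if state is State.ST2:
        return State.ST3 if pixels_done else State.ST2
    if state is State.ST3:
        return State.ST4
    return State.ST0 if stop2 else State.ST4

def outputs(state):
    """Active-high control outputs asserted in each state."""
    return {
        "en_pipe": state is State.ST2,
        "stop_random": state is State.ST3,
        "en_pix": state is State.ST4,
    }

s = State.ST0
for inputs in ({"start": True}, {"addr_ready": True}, {"pixels_done": True}, {}, {"stop2": True}):
    s = step(s, **inputs)
    print(s.name, outputs(s))
```

Keeping the next-state logic and the output logic in separate functions mirrors the usual two-process style of FSM description in HDL.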

### 6.4 Performance Evaluation and Discussion:

This section presents the performance of the system. The system performance is analyzed on two different simulation platforms, MATLAB and the Xilinx tools. First, the effectiveness of the proposed algorithm is tested over a large number of benchmark images. Then, the FPGA-based prototype is designed, tested and characterized using the Xilinx ISE 14.5 synthesis tool.

#### **6.4.1 Software Simulation:**

The MATLAB-based performance evaluation of the proposed spatial domain access control algorithm is carried out over a large number of grayscale benchmark images of size (512×512) with varied image characteristics.

A few selected test images are shown in Figure 6.13 (a-d). The spatial domain is selected because watermark embedding in this domain is simple and direct. At the same time, spatial domain watermarking has a very low computational load.




Figure 6.13: (a) Test image Lena; (b) Test image Pepper; (c) Test image Home; (d) Test image Baboon; (e) watermarked image in spatial domain; (f) decoded image of (a). (P, M) above each image represents the PSNR (in dB) and MSSIM values of the image. (Image database cited on 17/08/2018: http://sipi.usc.edu/database/database.php?volume=misc&image=11#top)

The proposed algorithm uses a dither of length 63, and the separation between two consecutive pixels/coefficients used for embedding one watermark bit is 4096. In our design, the step size 'delta' ( $\Delta$ ) is set to 20. The effectiveness of the scheme is assessed by comparing the peak signal-to-noise ratio (PSNR, in dB) [210] and the mean structural similarity index measure (MSSIM) [179] of the watermarked image with respect to the original image. The security of the watermarked image is assessed by computing the KLD [211]. These parameters are discussed briefly in Chapter 2.
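For reference, the PSNR of an 8-bit image follows from the mean squared error (MSE) as $10\log_{10}(255^2/\mathrm{MSE})$. The sketch below is a minimal plain-Python version of this standard definition, with images given as nested lists; the function names and the tiny test images are ours and are not data from the experiments.

```python
import math

def mse(original, processed):
    """Mean squared error between two equally sized greyscale images."""
    total = 0.0
    count = 0
    for row_a, row_b in zip(original, processed):
        for a, b in zip(row_a, row_b):
            total += (a - b) ** 2
            count += 1
    return total / count

def psnr(original, processed, peak=255):
    """PSNR in dB; returns infinity for identical images."""
    error = mse(original, processed)
    if error == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / error)

reference = [[100, 120], [140, 160]]
distorted = [[105, 125], [145, 165]]  # every pixel off by 5, so MSE = 25
print(round(psnr(reference, distorted), 2))
```

A uniform error of 5 grey levels gives a PSNR of about 34.15 dB, which gives a feel for the scale of the values reported in Table 6.1.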

The PSNR (dB) and MSSIM values before and after decoding are given in Table 6.1 for the four test images shown in Figure 6.13 (a-d). A normal user without knowledge of the key can view only poor-quality images, with the PSNR (dB) values shown in Column 2 of Table 6.1. A user holding the subscription key, however, can decode a better-quality image, as the PSNR (dB) values in the 6<sup>th</sup> column of Table 6.1 are higher.

| Name of Image | Before decoding: PSNR (dB) | Before decoding: MSSIM | Before decoding: NCC | Before decoding: KLD | After decoding: PSNR (dB) | After decoding: MSSIM |
|---------------|----------------------------|------------------------|----------------------|----------------------|---------------------------|-----------------------|
| Lena          | 30.45                      | 0.83                   | 1                    | 0.0000083            | 35.55                     | 0.90                  |
| Pepper        | 30.61                      | 0.83                   | 1                    | 0.0000078            | 35.68                     | 0.90                  |
| Home          | 30.51                      | 0.87                   | 1                    | 0.0000080            | 35.65                     | 0.92                  |
| Baboon        | 30.43                      | 0.91                   | 1                    | 0.0000076            | 35.62                     | 0.96                  |

Table 6.1: PSNR (dB) and MSSIM before and after the decoding process.

Figure 6.13 (e) is the watermarked image after embedding a watermark of size $(32 \times 32)$. Figure 6.13 (f) is the same image after self-noise suppression.

The values in Table 6.1 show that the spatial domain approach offers good image fidelity after watermark insertion. Moreover, the image characteristics do not have a large impact on the fidelity of the watermarked image. The watermark extracted in the spatial domain is shown in Figure 6.14 (a). Some typical signal and image processing operations are performed to test robustness. The numerical values in Table 6.2 are averages over 100 independent experiments carried out over a large number of benchmark images with diverse image characteristics. We have also studied the robustness of the proposed scheme against cropping.



Figure 6.14: (a) Extracted watermark in the spatial domain; (b) Results for cropping; (c) Extracted watermark from (b).

The experimental results in Figure 6.14(b) and (c) show that the scheme returns promising results for cropping, as the watermark bits are inserted in disjoint positions.

| Operation (Strength)              | Domain  | NCC  | Operation (Strength)           | Domain  | NCC  |
|-----------------------------------|---------|------|--------------------------------|---------|------|
| Median Filtering (3×3)            | Spatial | 0.96 | Salt & Pepper Noise<br>(0.005) | Spatial | 1    |
| Mean Filtering (3×3)              | Spatial | 0.75 | Salt & Pepper Noise<br>(0.009) | Spatial | 1    |
| Highpass Filtering (1.8)          | Spatial | 0.43 | Speckle Noise (0.001)          | Spatial | 0.76 |
| Down & Up Sampling<br>(0.90)      | Spatial | 0.97 | Speckle Noise (0.005)          | Spatial | 0.56 |
| Down & Up Sampling<br>(0.75)      | Spatial | 0.89 | Speckle Noise (0.009)          | Spatial | 0.67 |
| Histogram Equalization            | Spatial | 0.45 | Gaussian Noise<br>(0.001)      | Spatial | 0.55 |
| Dynamic Range Change<br>[50- 200] | Spatial | 0.33 | Gaussian Noise<br>(0.005)      | Spatial | 0.55 |
| Salt & Pepper Noise<br>(0.001)    | Spatial | 1    | Gaussian Noise<br>(0.009)      | Spatial | 0.55 |

Table 6.2: Results of test images with different image/signal processing operations


Table 6.3 compares the PSNR (dB) and NCC values of the conventional QIM-based detection scheme [202] and the proposed algorithm after various common image and signal processing operations. The proposed scheme gives better results than the scheme in [202]. This is because the proposed method embeds each watermark bit over N mutually orthogonal signal points and detects the watermark bit from the weighted average of N decision statistics. As a result, the watermark bit is decoded more accurately even after different signal manipulations, which is confirmed by the high NCC values. A comparison of the average embedding capacity and PSNR of different data hiding methods is given in Table 6.4.

| Operation             | Conventional QIM detection [202]: PSNR (dB) | Conventional QIM detection [202]: NCC | Proposed: PSNR (dB) | Proposed: NCC |
|-----------------------|---------------------------------------------|---------------------------------------|---------------------|---------------|
| Median (3×3)          | 30.29                                       | 0.63                                  | 32.78               | 0.96          |
| Mean (3×3)            | 27.92                                       | 0.68                                  | 30.64               | 0.75          |
| Downsampling (0.9)    | 31.96                                       | 0.86                                  | 33.91               | 0.97          |
| Downsampling (0.75)   | 30.94                                       | 0.75                                  | 32.92               | 0.89          |
| Salt & pepper (0.001) | 31.80                                       | 0.95                                  | 32.26               | 1             |

#### Table 6.3: Performance comparison in PSNR (dB) and NCC.

The proposed method offers better PSNR values than the related works. Although its embedding capacity is lower than that of the related works, this capacity is sufficient to regulate quality access control effectively.

Table 6.4: Comparison of results in terms of the embedding method, domain, fidelity, capacity, technique.

| Related<br>Work              | Embedding<br>Method           | Domain               | PSNR<br>(in dB) | Embedding<br>Capacity<br>(Bits) | Technique                                         |
|------------------------------|-------------------------------|----------------------|-----------------|---------------------------------|---------------------------------------------------|
| Chen et<br>al. [ <u>212]</u> | Block<br>Truncation<br>Coding | Compressed<br>Domain | 31.33           | 17146                           | Reversible Data<br>Hiding                         |
| Lo et al.<br>[ <u>60]</u>    | Histogram<br>Shifting         | Compressed<br>Domain | 32.10           | 4180                            | Reversible Data<br>Hiding                         |
| Sun et al.<br>[ <u>55]</u>   | Joint Neighbor<br>Coding      | Compressed<br>Domain | 31.60           | 64008                           | Reversible Data<br>Hiding                         |
| Kim et al.<br>[ <u>56]</u>   | Histogram<br>Modification     | Compressed<br>Domain | 33.92           | 148506                          | Reversible Data<br>Hiding                         |
| Proposed                     | QIM                           | Spatial              | 35.55           | 1024                            | Semi-fragile<br>Data Hiding for<br>Access Control |

## 6.4.2 Hardware Realization Using FPGA:

This section briefly presents the hardware simulation results of the prototype implementation of the access control system.

Moreover, to improve the performance of the decoder, a pipelined architecture is designed in a hardware description language (HDL); it achieves a maximum clock speed of 120 MHz.

The FPGA implementation (XC7Z020-CLG484-1) is characterized in terms of FPGA resource requirements such as configurable logic blocks (CLBs), look-up tables (LUTs), flip-flops, DSP48E1s and block RAM (BRAM).

The power consumption and the throughput are also investigated and presented in this section.

The timing simulations and the characterization are done with the XILINX ISE 14.5 synthesis tools and the Vivado 14.2 design suite, respectively.

The hardware implementation cost is calculated in terms of the basic resource utilization of the FPGA and is summarized in Table 6.5.

| Sr. No. | Parameter                 | Encoder (Serial)        | Encoder (Serial)        | Decoder (Serial)        | Decoder (Serial)        | Decoder (Pipelined)     | Decoder (Pipelined)     |
|---------|---------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|
| 1       | Target board details      | Zynq (XC7Z020-CLG484-1) | Zynq (XC7Z020-CLG484-1) | Zynq (XC7Z020-CLG484-1) | Zynq (XC7Z020-CLG484-1) | Zynq (XC7Z020-CLG484-1) | Zynq (XC7Z020-CLG484-1) |
| 2       | Image size                | (256×256)               | (512×512)               | (256×256)               | (512×512)               | (256×256)               | (512×512)               |
| 3       | Number of slice LUT       | 169                     | 237                     | 232                     | 389                     | 244                     | 456                     |
| 4       | Number of slice registers | 108                     | 182                     | 165                     | 245                     | 148                     | 227                     |
| 5       | Number of fully used LUTs | 95                      | 121                     | 115                     | 144                     | 127                     | 162                     |
| 6       | Bonded IOBs               | 11                      | 12                      | 11                      | 12                      | 11                      | 12                      |
| 8       | Number of block RAM       | 16                      | 64                      | 16                      | 64                      | 16                      | 64                      |
| 9       | Number of DSP 48E1s       | 1                       | 1                       | 1                       | 1                       | 1                       | 1                       |

Table 6.5: Resource utilization summary of the encoder and pipelined decoder.

The timing report (operating frequency, hardware execution time and total clock cycles) of the access control system is given in Table 6.6. The hardware executes quickly: a $(512\times512)$ image is processed in about 2.2 ms. This is a great advantage for real-time applications of the system. The encoder and decoder circuits are implemented separately.

The data rate $(D_r)$ is calculated using equation (2.17) of Chapter 2. Accordingly, the encoder provides a data rate of 119.54 Megabyte/s, whereas the pipelined decoder achieves a data rate of 119.048 Megabyte/s. The power consumption of the system is the sum of the static (S) and dynamic (D) power consumption. The static power of a system is consumed by the utilized FPGA resources [206].

| Sr. No. | Parameter                         | Encoder   | Encoder   | Decoder (Serial) | Decoder (Serial) | Decoder (Pipelined) | Decoder (Pipelined) |
|---------|-----------------------------------|-----------|-----------|------------------|------------------|---------------------|---------------------|
|         | Image size                        | (256×256) | (512×512) | (256×256)        | (512×512)        | (256×256)           | (512×512)           |
| 1       | Maximum frequency (MHz)           | 120.114   | 120.013   | 120.02           | 119.626          | 120.85              | 120.01              |
| 2       | Hardware execution time (second)  | 0.0005541 | 0.0021928 | 0.0005547        | 0.0022002        | 0.0005597           | 0.00220199          |
| 3       | Total clock required              | 65536     | 263171    | 66587            | 263203           | 67646               | 264262              |
| 4       | Data rate (Megabyte/s)            | 118.26    | 119.54    | 118.12           | 119.14           | 117.08              | 119.048             |

Table 6.6: Timing report and data rate.
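The $(512\times512)$ entries of Table 6.6 can be cross-checked with simple arithmetic. Equation (2.17) is not reproduced in this chapter, so the sketch below assumes the common definitions: execution time is the clock count divided by the clock frequency, and the data rate is the number of image bytes divided by the execution time. The function names are ours.

```python
def execution_time_s(clock_cycles: int, freq_mhz: float) -> float:
    """Execution time in seconds for a given clock count and frequency."""
    return clock_cycles / (freq_mhz * 1e6)

def data_rate_mbytes(num_bytes: int, seconds: float) -> float:
    """Data rate in Megabyte/s (assumed form of the data-rate equation)."""
    return num_bytes / seconds / 1e6

IMAGE_BYTES = 512 * 512  # one byte per 8-bit pixel

# (clock cycles, maximum frequency in MHz) for the (512x512) columns of Table 6.6
designs = {
    "encoder": (263171, 120.013),
    "pipelined decoder": (264262, 120.01),
}

for name, (cycles, freq) in designs.items():
    t = execution_time_s(cycles, freq)
    print(f"{name}: {t * 1e3:.4f} ms, {data_rate_mbytes(IMAGE_BYTES, t):.2f} Megabyte/s")
```

With these definitions the encoder comes out at about 2.19 ms and 119.5 Megabyte/s and the pipelined decoder at about 2.20 ms and 119.0 Megabyte/s, in agreement with the tabulated values.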

The dynamic power consumption of an FPGA-based system depends on the effective load capacitance '$C$', the switching frequency '$f$' and the square of the voltage swing '$V^2$'. It is given by equation (2.15) of Chapter 2.

 Table 6.7: The synthesized power consumption report

| Image Size | Serial Encoder: S | Serial Encoder: D | Serial Encoder: T | Serial Decoder: S | Serial Decoder: D | Serial Decoder: T | Pipelined Decoder: S | Pipelined Decoder: D | Pipelined Decoder: T |
|------------|-------------------|-------------------|-------------------|-------------------|-------------------|-------------------|----------------------|----------------------|----------------------|
| (256×256)  | 17.77             | 64.08             | 78.84             | 18.53             | 76.14             | 94.67             | 18.53                | 66.73                | 85.26                |
| (512×512)  | 18.03             | 68.18             | 86.21             | 18.91             | 90.16             | 109.07            | 18.83                | 77.05                | 95.88                |

\* S = Static power, D = Dynamic power, T = Total power; all values in mW (milliwatt).

The system power consumption is analyzed with the XILINX X-Power analyzer tool, and the power consumption report is given in Table 6.7. The total system power consumption is the sum of the static and dynamic power consumption. For the (512×512) image, the serial decoder module consumes a total power of 109.07 mW (static plus dynamic), which is more than the serial encoder module. It is to be noted that, for the serial decoder, 83% of the consumed power is dynamic power. It is worth mentioning here that a pipelined architecture is an efficient approach to decreasing the dynamic power consumption [79, 81].

To reduce the dynamic power consumption, we have designed a pipelined decoder architecture. The pipelined access control decoder saves 14.54% of the dynamic power and consumes only 95.88 mW of total power.
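The two percentages quoted above follow directly from the (512×512) row of Table 6.7. The short sketch below reproduces them; the variable and function names are ours.

```python
# (512x512) row of Table 6.7, values in mW
SERIAL_DYNAMIC = 90.16
SERIAL_TOTAL = 109.07
PIPELINED_DYNAMIC = 77.05

def percent(part: float, whole: float) -> float:
    """Express `part` as a percentage of `whole`."""
    return 100.0 * part / whole

dynamic_share = percent(SERIAL_DYNAMIC, SERIAL_TOTAL)
dynamic_saving = percent(SERIAL_DYNAMIC - PIPELINED_DYNAMIC, SERIAL_DYNAMIC)

print(f"dynamic share of serial decoder power: {dynamic_share:.1f}%")
print(f"dynamic power saved by pipelining: {dynamic_saving:.2f}%")
```

The dynamic share evaluates to about 82.7%, which the text rounds to 83%, and the saving evaluates to 14.54%.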

| Scheme                       | Design Type | Throughput         |
|------------------------------|-------------|--------------------|
| Maity and Kundu [71]         | FPGA        | 4.706 Megabits/s   |
| Maity and Maity [76]         | FPGA        | 1.0395 Megabits/s  |
| Das et al. [ <u>81</u> ]     | FPGA        | 35.284 Megabits/s  |
| Mandal et al. [ <u>199</u> ] | FPGA        | 11.39 Megabyte/s   |
| Proposed Scheme              | FPGA        | 119.048 Megabyte/s |

 Table 6.8: Comparison of results by means of throughput.

Table 6.9: Comparison of results for resource utilization and power consumption.

|                                       | Α         | В             |                   | С     | D     | E     | F   | G     | Н       |  |
|---------------------------------------|-----------|---------------|-------------------|-------|-------|-------|-----|-------|---------|--|
| Maity<br>and<br>Maity<br>[ <u>76]</u> | (32×32)   | Spatial       | Embedding         | 9881  | 9347  | 11291 | 3   |       | 98.7    |  |
|                                       |           |               | Decoding          | 14600 | 12531 | 28753 | 3   | 750   |         |  |
| Darji et<br>al. [ <u>79]</u>          | (512×512) | DWT<br>(Haar) | Embedding<br>(P)  | 1729  | 900   | 3153  | 305 | 103   | 97.92   |  |
|                                       |           |               | Embedding<br>(PR) | 2002  | 1027  | 3453  | 305 | 99    | 07.02   |  |
| Das et al.<br>[ <u>81]</u>            |           | DE            | Encoder           | NA    | 20    | 90    | NA  | 100   | 176.24  |  |
|                                       |           |               | Decoder           | NA    | 20    | 90    | NA  | 200   | 173.93  |  |
| Proposed<br>Scheme                    |           | Spatial       | Encoder           | 366   | 182   | 237   | 64  | 86.21 | 120.013 |  |
|                                       |           |               | Decoder (P)       | 395   | 227   | 456   | 64  | 95.88 | 120.01  |  |

DE = Difference expansion, P = Pipeline implementation, PR = Parallel implementation, A = Image size, B = Working domain, C = Occupied slices, D = Slice registers, E = Slice LUTs, F = BRAMs, G = Total power (mW), H = Clock (MHz).

The merits of the architectural design are shown in Table 6.8 and Table 6.9. Table 6.8 compares the throughput of the proposed scheme with that of other schemes; the throughput of the proposed scheme is found to be very high in comparison with the widely used implementations found in the literature. Table 6.9 compares the implemented hardware architecture with other schemes in terms of resource utilization and power consumption. It is observed that the proposed scheme achieves better utilization than the other schemes. The performance of the scheme is optimized in terms of frequency, power consumption, and area. The optimization is done by imposing different synthesis parameter constraints in the Vivado 14.2 design suite synthesis tool. It is observed that, in an FPGA-based design, the power consumption and the resource utilization increase as the frequency increases. Therefore, a trade-off among frequency, resource utilization and power consumption has to be found. Figure 6.15 shows the trade-off among these parameters. The above discussion shows that the proposed prototype implementation is superior to the related works found in the literature in terms of logic utilization, power consumption, throughput and operating frequency.



Figure 6.15: Frequency vs. Power trade-off curve and Area (in terms of resource utilization) /Resource vs. Frequency trade-off curve




Figure 6.16: Hardware description language (HDL) simulation results: (a) Dither 'd0', 'd1' and the permuted watermark bits; (b) Encoded image pixels, (c) Decoded image pixels

A selected portion of the Xilinx ISE simulation results is shown in Figure 6.16 (a-c). Figure 6.16 (a) shows the dither sequences 'D0' and 'D1' and the permuted watermark. Figure 6.16 (b) depicts the encoded pixels. The decoded image pixels are shown in Figure 6.16 (c). The decoded pixels are found to have an error probability of up to 5%. This is due to the quantization factor described in Algorithm 1. The error probability can be reduced by reducing the step size ( $\Delta$ ). The presented results show that the scheme achieves high throughput and consumes very little power. This makes the hardware design attractive for portable image access control applications compared with similar hardware designs in the current literature.

### 6.5 Chapter Summary:

In this chapter, we have implemented a data-hiding scheme for quality access control of digital images using QIM. FPGA-based hardware is also implemented, where each watermark bit is spread over N mutually orthogonal signal points. The efficiency and novelty of the scheme are tested on both software and hardware platforms. The experimental results show that the scheme is robust against a wide range of common image processing operations and attacks, especially cropping. On the other hand, the hardware simulation of an 8-bit grayscale (512×512) image encoder using a Xilinx (XC7Z020-CLG484) FPGA requires only 237 slice LUTs and 182 slice registers, with a power consumption of 86.21 mW at an operating frequency of 120.013 MHz. It is also seen that the pipelined decoder implementation of the scheme uses 456 slice LUTs and 227 slice registers, with a power consumption of 95.88 mW at an operating frequency of 120.01 MHz.

FPGA Based Reconfigurable Hardware Architecture for Quality Access Control of Images and Coupling in Fiber Optics Communication

https://www.kdpublications.in

ISBN: 978-93-90847-75-4

## Chapter 7

## Mismatch Considerations in Laser Diode to Single-Mode Circular Core Triangular Index Fiber Excitation via Upside Down Tapered Hemispherical Microlens on the Fiber Tip

### 7.1 Introduction:

The most suitable technique in coupling optics is the fabrication of a microlens on the tip of an optical fiber, where it acts as a coupling device to maximize the coupling efficiency [102, 104, 140]. Different types of microlenses are fabricated on the tip of the optical fiber. Although many types of microlenses exist, conical and hemispherical microlenses are used most widely, as they have self-centering characteristics [104]. It has been shown that the most effective microlens on a fiber in terms of coupling is the hyperbolic microlens, as it gives a theoretical coupling efficiency of around 100% [103, 140, 213, 214] at a particular focal length.

On the other hand, although the hemispherical microlens is slightly less efficient as a coupler [104, 140], it is commonly used because it is easy to fabricate. Further, the fabrication of an upside down tapered microlens on the optical fiber tip also demands special attention in optical communication systems [109, 122, 125-127, 130, 149, 153, 155, 215, 216]. A simple photolithographic technique can be used to form a hemispherical microlens on the tip of the fiber. However, the coupling efficiency is severely degraded by spherical aberration, mode mismatch and the limited aperture, which prevents the lens from collecting all the available radiation from the laser source [102, 140].

In contrast, an upside down tapered hemispherical microlens (UDTHML) on the fiber end is made by gradually tailoring the fiber end into a larger cross-section, which allows efficient acceptance of the laser light. If the fiber is heated to the softening temperature of the material, surface tension causes an upside down tapered lens to form inside a shaping mold [127]. This type of lens not only provides efficient coupling of the light source to the optical fiber; its large aperture has also been successfully employed in long-span optical fiber sensors and other micro-optical systems [125, 126].

Analyses of different types of microlenses on the tips of various types of optical fibers by the well-known ABCD matrix method have shown that the method is not only very precise but also very simple [105, 107, 117, 119, 153]. Further, an ABCD matrix characterizes a particular optical system, and the ABCD transformation matrix for a hemispherical microlens on the fiber tip [107] differs from that for the UDTHML drawn on either a triangular index fiber [122] or a fiber of any other refractive index profile.

The theoretical prediction of coupling optics involving a laser diode and a single-mode circular core triangular index fiber by ABCD matrix formalism has been reported recently [122]. Motivated by the excellent accuracy of that prediction, we apply the relevant ABCD matrix formalism to assess the coupling efficiencies in the presence of possible mismatches for this kind of coupler. The theory includes some repetition of [122], since our study of misalignment also comprises the evaluation of coupling efficiencies in the absence of the transverse and angular mismatches, namely $d_1 = 0$, $d_2 = 0$ and $\theta = 0$, and this has to agree with the values of coupling efficiencies in the absence of mismatches [122].

Moreover, the evaluation of coupling efficiency in the presence of the said two mismatches involves the calculation of $w_{2x}$, $w_{2y}$, $R_{2x}$, and $R_{2y}$, and this needs the ABCD matrix appropriate for the system [122]. Optical fibers made of silica have minimum attenuation loss at a wavelength of 1.55 $\mu$m, while their material dispersion vanishes at a wavelength of 1.3 $\mu$m [99]. In this context, it is worth mentioning that for triangular-index fibers the attenuation loss is around 0.21 dB/km at the wavelength 1.55 $\mu$m [217]. In addition, a triangular-index fiber, which happens to be the simplest kind of dispersion-shifted fiber, has low micro- and macro-bending loss [217]. The main disadvantage of this fiber is the low value of its first higher order mode cut-off wavelength, which lies between 0.85 $\mu$m and 0.90 $\mu$m [218]. Therefore, the operating wavelength is required to be far away from the cut-off wavelength for such types of fibers. Hence two operating wavelengths, namely 1.3 $\mu$m and 1.5 $\mu$m, are selected accordingly.

In this chapter, we report the use of ABCD matrix formalism for this type of coupler to formulate analytical expressions for the coupling efficiencies in the presence of both transverse and angular mismatches. We also report how the prescribed formulations lead to an accurate prediction of the concerned coupling efficiencies in a simple fashion. The results will be helpful to packagers who deal with such a coupling device in the area of launch optics.

### 7.2 Theory:

Fig. 7.1 shows the upside-down tapered coupling device. The laser diode source is placed at plane 1. The laser light is refracted through the upside down tapered hemispherical microlens to the end face of the fiber, which is represented by plane 2 in Fig. 7.1.

The incident light beam is taken to propagate along the Z-direction, with the end face of the fiber lying in the X-Y plane. The refractive indices of the incident medium and the lens are represented by $n_1$ and $n_2$ respectively.

In our investigation, we employ some common approximations, specifically no transmission loss, polarization matching of the fiber field with the laser field, and Gaussian field distributions for both the source and the fiber [102, 104, 140].

The Gaussian spot sizes of the elliptical intensity profile of the optical beam emitted from the laser diode along two mutually perpendicular directions are represented by w1x and w1y with x and y-axis being respectively parallel and perpendicular to the junction plane.





Figure 7.1: Diagram of upside down tapered hemispherical microlens coupler on the end of a single-mode circular core triangular index fiber

The field  $(\Psi_u)$  of the output beam of the laser diode at a distance 'u' from the surface of the lens can be expressed as [152, 219, 220],

$$\Psi_{u} = \exp\left[-\left(\frac{x^{2}}{w_{1x}^{2}} + \frac{y^{2}}{w_{1y}^{2}}\right)\right] \times \exp\left[-jk_{1}\frac{\left(x^{2} + y^{2}\right)}{2R_{1}}\right]$$
(7.1)

Here, $k_1$ is the wave number in the incident medium and $R_1$ is the radius of curvature of the wavefront from the laser source.

Using the Gaussian approximation, the fundamental mode $(\Psi_f)$ of the circular core single-mode optical fiber can be written as [25-27],

$$\Psi_f = \exp\left[-\frac{x^2 + y^2}{w_f^2}\right]$$
(7.2)

Here,  $w_f$  represents the spot size for graded index fiber and it can be expressed as [99, 105, 107, 117, 119, 122, 152, 153, 217-220],

$$w_f = a_0 \left[ \frac{A'}{V^{2/(g+2)}} + \frac{B'}{V^{3/2}} + \frac{C'}{V^6} \right]$$
(7.3)

where $V = k_0 a_0\sqrt{n_{co}^2 - n_{cl}^2}$ is the normalized frequency, $k_0$ is the free-space wave number, $a_0$ is the core radius, $n_{co}$ is the core refractive index, and $n_{cl}$ is the cladding refractive index.

The values of arbitrary constants A', B' and C' can be evaluated (within  $1.5 < V < \infty$ ) by parameter optimization technique and the evaluated values are presented below.

$$A' = \left[\frac{2}{5}\left\{1 + 4\left(\frac{2}{g}\right)^{5/6}\right\}\right]^{1/2}$$
$$B' = e^{0.298/g} - 1 + 1.478\left(1 - e^{-0.077g}\right)$$
$$C' = 3.76 + e^{4.19/g^{0.418}}$$
(7.4)

Here, $g$ denotes the profile exponent of the fiber; $g = 1$ for a triangular index fiber.
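As a rough numerical illustration of (7.3) and (7.4), the following Python sketch evaluates the ratio $w_f/a_0$ for a triangular index fiber ($g = 1$). The function names are illustrative only. The power 5/6 on $(2/g)$ in $A'$ is the value used in Marcuse's empirical fit and is exposed as a parameter, and the core radius `ASSUMED_A0_UM` is a hypothetical value, since $a_0$ is not stated in the text. With that assumed radius the sketch returns spot sizes close to the 2.676 $\mu$m, 3.238 $\mu$m and 9.901 $\mu$m quoted in Section 7.3.

```python
import math

def marcuse_constants(g, a_exponent=5.0 / 6.0):
    """Constants A', B', C' of (7.4); a_exponent is the power on (2/g) in A'."""
    a = math.sqrt(0.4 * (1.0 + 4.0 * (2.0 / g) ** a_exponent))
    b = math.exp(0.298 / g) - 1.0 + 1.478 * (1.0 - math.exp(-0.077 * g))
    c = 3.76 + math.exp(4.19 / g ** 0.418)
    return a, b, c

def spot_size_ratio(v, g=1.0):
    """w_f / a_0 from (7.3) for normalized frequency v and profile exponent g."""
    a, b, c = marcuse_constants(g)
    return a / v ** (2.0 / (g + 2.0)) + b / v ** 1.5 + c / v ** 6

# Hypothetical core radius in micrometres; the text does not state a_0.
ASSUMED_A0_UM = 3.65

for v in (4.380, 3.511, 1.924):
    ratio = spot_size_ratio(v)
    print(f"V = {v:.3f}: w_f/a_0 = {ratio:.4f}, "
          f"w_f = {ratio * ASSUMED_A0_UM:.3f} um")
```

The ratio grows quickly as $V$ decreases because of the $C'/V^6$ term, which is why the fiber with the lowest $V$ number has by far the largest spot size.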

The field  $\Psi_{v}$  of the incident laser beam on the face of the fiber end (plane 2 as shown in Fig. 7.1) via upside down tapered hemispherical microlens is given by,

$$\Psi_{v} = \exp\left[-\left(\frac{x^{2}}{w_{2x}^{2}} + \frac{y^{2}}{w_{2y}^{2}}\right)\right] \times \exp\left[-\frac{jk_{2}}{2}\left(\frac{x^{2}}{R_{2x}} + \frac{y^{2}}{R_{2y}}\right)\right]$$
(7.5)

where the lens-transformed spot sizes along the X and Y directions are represented by $w_{2x}$ and $w_{2y}$, and the corresponding radii of curvature by $R_{2x}$ and $R_{2y}$.

The wave number in the lens medium is $k_2$. Again, $w_{2x, 2y}$ and $R_{2x, 2y}$ can be found in terms of $w_{1x, 1y}$ and $R_1$ by using the following relation [122],

$$q_2 = \frac{Aq_1 + B}{Cq_1 + D}$$
(7.6)

The parameters A, B, C and D of the above equation are the elements of the transformation matrix. The input and output beam parameters $q_1$ and $q_2$ of the laser beam are given as,

$$\frac{1}{q_{1,2}} = \frac{1}{R_{1,2}} - \frac{j\lambda_0}{\pi w_{1,2}^2 n_{1,2}}$$
(7.7)

where $R$ is the radius of curvature of the wavefront, $n$ is the refractive index, $w$ is the spot size, and $\lambda_0$ is the wavelength in free space.
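The transformation (7.6) can be tried out numerically with complex arithmetic. The sketch below is a minimal illustration in Python, not the implementation used for the results of this chapter. The helper names are hypothetical, the refractive index enters through the wavelength in the medium $\lambda_0/n$, and the optical system is simply a free-space section of length `u` (the second matrix of (7.8)), for which the transformed spot size can be compared with the textbook Gaussian beam expansion law.

```python
import math

def q_from_beam(w, R, wavelength0, n=1.0):
    """Complex beam parameter q from spot size w and wavefront radius R."""
    inv_R = 0.0 if math.isinf(R) else 1.0 / R
    inv_q = complex(inv_R, -wavelength0 / (math.pi * w * w * n))
    return 1.0 / inv_q

def beam_from_q(q, wavelength0, n=1.0):
    """Recover (w, R) from q; R is infinite for a planar wavefront."""
    inv_q = 1.0 / q
    w = math.sqrt(-wavelength0 / (math.pi * n * inv_q.imag))
    R = math.inf if inv_q.real == 0.0 else 1.0 / inv_q.real
    return w, R

def abcd_transform(q1, A, B, C, D):
    """Bilinear transformation of the beam parameter, as in (7.6)."""
    return (A * q1 + B) / (C * q1 + D)

# Source spot size quoted in Section 7.3 for the 1.5 um source (x direction);
# the separation u is a hypothetical value chosen for illustration.
w1, wavelength0, u = 0.843, 1.5, 10.0      # all lengths in micrometres
q1 = q_from_beam(w1, math.inf, wavelength0)  # planar wavefront at the source
q2 = abcd_transform(q1, 1.0, u, 0.0, 1.0)    # free space: A=1, B=u, C=0, D=1
w2, R2 = beam_from_q(q2, wavelength0)
print(f"after {u} um of free space: w = {w2:.3f} um, R = {R2:.3f} um")
```

For a more complete system the same `abcd_transform` call is used with the elements of the full matrix product instead of the free-space elements.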

The upside down tapered hemispherical lens on the end face of a triangular index fiber can be represented by the ray transformation matrix [122, 126],


$$\begin{pmatrix} A & B \\ C & D \end{pmatrix} = \begin{pmatrix} A_1 & B_1 \\ C_1 & D_1 \end{pmatrix} \begin{pmatrix} 1 & u \\ 0 & 1 \end{pmatrix}$$
(7.8)

Here, the terms $A_1$, $B_1$, $C_1$ and $D_1$ can be expressed as,

$$A_{1} = r_{2}(z) + \frac{n_{co} - 1}{n_{co}R_{0}}r_{1}(z) ,$$

$$B_{1} = \frac{r_{1}(z)}{n_{1}} ,$$

$$C_{1} = \frac{dr_{2}(z)}{dz} + \frac{n_{co} - 1}{n_{co}R_{0}}\frac{dr_{1}(z)}{dz} ,$$

$$D_{1} = \frac{1}{n_{co}}\frac{dr_{1}(z)}{dz}$$
(7.9)

Here,  $r_1(z)$ ,  $\frac{dr_1(z)}{dz}$ ,  $r_2(z)$  and  $\frac{dr_2(z)}{dz}$  are given as,

$$r_{1}(z) = -\frac{L}{\alpha} \left(1 - \frac{z}{L}\right)^{\frac{1}{2}} \sin k(z) ,$$

$$\frac{dr_{1}(z)}{dz} = \frac{1}{\left(1 - \frac{z}{L}\right)^{\frac{1}{2}}} \left[\cos k(z) + \frac{1}{2\alpha} \sin k(z)\right] ,$$

$$r_{2}(z) = \left(1 - \frac{z}{L}\right)^{\frac{1}{2}} \left[\cos k(z) - \frac{1}{2\alpha} \sin k(z)\right] ,$$

$$\frac{dr_{2}(z)}{dz} = \frac{A^{2}L}{\alpha \left(1 - \frac{z}{L}\right)^{\frac{1}{2}}} \sin k(z)$$

(7.10a)

Further, the parameters $k(z)$ and $\alpha$ can be expressed as [122],

$$k(z) = \alpha \ln \left( 1 - \frac{z}{L} \right)$$

$$\alpha = \left(A^2 L^2 - \frac{1}{4}\right)^{\frac{1}{2}}$$
(7.10b)
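The expressions in (7.10a) and (7.10b) can be checked for internal consistency. The sketch below, with hypothetical values for the taper length $L$ and for the parameter $A$ of (7.10b) (called `A_grad` here to avoid confusion with the matrix element $A$), evaluates the four functions and confirms two properties one would expect of a pair of independent ray solutions: they reduce to the identity at $z = 0$, and $r_2\,dr_1/dz - r_1\,dr_2/dz = 1$ along the taper.

```python
import math

def taper_rays(z, L, A_grad):
    """Return r1, dr1/dz, r2, dr2/dz of (7.10a) at axial position z (z < L)."""
    alpha = math.sqrt(A_grad * A_grad * L * L - 0.25)  # (7.10b); needs A_grad*L > 1/2
    s = 1.0 - z / L
    k = alpha * math.log(s)                            # k(z) of (7.10b)
    root = math.sqrt(s)
    r1 = -(L / alpha) * root * math.sin(k)
    dr1 = (math.cos(k) + math.sin(k) / (2.0 * alpha)) / root
    r2 = root * (math.cos(k) - math.sin(k) / (2.0 * alpha))
    dr2 = (A_grad * A_grad * L / (alpha * root)) * math.sin(k)
    return r1, dr1, r2, dr2

# Hypothetical taper parameters (lengths in micrometres, A_grad in 1/um).
L_TAPER, A_GRAD = 500.0, 0.01

for z in (0.0, 100.0, 250.0):
    r1, dr1, r2, dr2 = taper_rays(z, L_TAPER, A_GRAD)
    det = r2 * dr1 - r1 * dr2
    print(f"z = {z:5.1f}: r1 = {r1:9.3f}, r2 = {r2:7.4f}, det = {det:.12f}")
```

The unit determinant follows from $\alpha^2 = A^2L^2 - 1/4$, so a value different from one would point to a transcription error in one of the four expressions.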

The lens transformed spot sizes  $(w_{2x, 2y})$  and the radii of curvature  $(R_{2x, 2y})$  can be derived by using (7.6) and (7.7),

$$w_{2x,2y}^{2} = \frac{A_{2}^{2}w_{1x,1y}^{2} + (\lambda_{1}^{2}B^{2})/(\pi^{2}w_{1x,1y}^{2})}{n(A_{2}D - BC_{2})}$$
(7.11)

$$\frac{1}{R_{2x,2y}} = \frac{A_2 C_2 w_{1x,1y}^2 + (\lambda_1^2 BD) / (\pi^2 w_{1x,1y}^2)}{A_2^2 w_{1x,1y}^2 + (\lambda_1^2 B^2) / (\pi^2 w_{1x,1y}^2)}$$
(7.12)

where,  $\lambda_1 = \lambda_0 / n_1$ ;  $A_2 = A + B / R_1$ ;  $C_2 = C + D / R_1$ ;  $n = n_2 / n_1$ .
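The closed forms (7.11) and (7.12) can be cross-checked against a direct evaluation of the bilinear transformation (7.6). The Python sketch below does this for one hypothetical system: a free-space gap `u` followed by refraction at a spherical surface of radius $R_0$ = 90 $\mu$m into a medium of index 1.55 (the lens values quoted in Section 7.3). The gap length, the simplified matrix and the function names are assumptions made for illustration and do not include the taper section. The numerator of (7.11) is taken with $\lambda_1^2$, matching the denominator of (7.12), and the refractive index enters the beam parameter through the wavelength in the medium, $\lambda_0/n$.

```python
import math

def lens_transformed_beam(w1, R1, lam0, n1, n2, A, B, C, D):
    """Spot size w2 and curvature 1/R2 from the closed forms (7.11), (7.12)."""
    lam1 = lam0 / n1
    inv_R1 = 0.0 if math.isinf(R1) else 1.0 / R1
    A2 = A + B * inv_R1
    C2 = C + D * inv_R1
    n = n2 / n1
    ell = lam1 * lam1 / (math.pi ** 2 * w1 * w1)   # lambda_1^2 / (pi^2 w1^2)
    common = A2 * A2 * w1 * w1 + ell * B * B
    w2_sq = common / (n * (A2 * D - B * C2))
    inv_R2 = (A2 * C2 * w1 * w1 + ell * B * D) / common
    return math.sqrt(w2_sq), inv_R2

def direct_q_transform(w1, R1, lam0, n1, n2, A, B, C, D):
    """Same quantities from 1/q2 = (C + D/q1) / (A + B/q1)."""
    inv_R1 = 0.0 if math.isinf(R1) else 1.0 / R1
    inv_q1 = complex(inv_R1, -lam0 / (math.pi * w1 * w1 * n1))
    inv_q2 = (C + D * inv_q1) / (A + B * inv_q1)
    w2 = math.sqrt(-lam0 / (math.pi * n2 * inv_q2.imag))
    return w2, inv_q2.real

# Hypothetical system: gap u, then a spherical interface from n1 to n2.
n1, n2, R0, u = 1.0, 1.55, 90.0, 10.0              # lengths in micrometres
C_s = (n1 - n2) / (n2 * R0)
A, B, C, D = 1.0, u, C_s, C_s * u + n1 / n2        # interface times free space

closed = lens_transformed_beam(0.843, math.inf, 1.5, n1, n2, A, B, C, D)
direct = direct_q_transform(0.843, math.inf, 1.5, n1, n2, A, B, C, D)
print("closed form:", closed)
print("direct q   :", direct)
```

Both routes should give the same pair $(w_2, 1/R_2)$ to within rounding, which is a convenient test when the elements of (7.8) to (7.10) are coded for the full coupler.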

Now, the coupling efficiency ($\eta$) can be expressed by the well-known overlap integral [105, 119, 122, 149, 153],

$$\eta = \frac{\left|\iint \Psi_{v} \Psi_{f}^{*} dx dy\right|^{2}}{\iint \left|\Psi_{v}\right|^{2} dx dy \iint \left|\Psi_{f}\right|^{2} dx dy}$$
(7.13)

Finally, using (7.2) and (7.5) in (7.13), we get

$$\eta = \frac{4w_{2x}w_{2y}w_{f}^{2}}{\left[\left(w_{f}^{2} + w_{2x}^{2}\right)^{2} + \frac{k_{2}^{2}w_{f}^{4}w_{2x}^{4}}{4R_{2x}^{2}}\right]^{1/2} \times \left[\left(w_{f}^{2} + w_{2y}^{2}\right)^{2} + \frac{k_{2}^{2}w_{f}^{4}w_{2y}^{4}}{4R_{2y}^{2}}\right]^{1/2}}$$
(7.14)
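A quick way to gain confidence in (7.14) is to compare it with a brute-force evaluation of the overlap integral (7.13). The sketch below does so in plain Python. Since the fields (7.2) and (7.5) are separable in x and y, the double integrals are evaluated as products of one-dimensional sums. The lens-transformed spot sizes and the radius of curvature used in the example are hypothetical, while the fiber spot size, the lens index and the wavelength are taken from Section 7.3.

```python
import cmath
import math

def eta_closed_form(w2x, w2y, wf, k2, inv_R2x=0.0, inv_R2y=0.0):
    """Coupling efficiency from (7.14); inv_R = 1/R, zero for a planar wavefront."""
    def axis_term(w2, inv_R):
        return math.sqrt((wf ** 2 + w2 ** 2) ** 2
                         + (k2 * wf ** 2 * w2 ** 2 * inv_R) ** 2 / 4.0)
    return 4.0 * w2x * w2y * wf ** 2 / (axis_term(w2x, inv_R2x)
                                        * axis_term(w2y, inv_R2y))

def integrate(f, half_width, n=4000):
    """Uniform-grid sum over [-half_width, half_width]; the integrands used
    here are negligible at the end points."""
    h = 2.0 * half_width / n
    return h * sum(f(-half_width + i * h) for i in range(n + 1))

def eta_numeric(w2x, w2y, wf, k2, inv_R2x=0.0, inv_R2y=0.0):
    """Coupling efficiency from the overlap integral (7.13), axis by axis."""
    def axis(w2, inv_R):
        half = 8.0 * max(w2, wf)
        overlap = integrate(
            lambda x: cmath.exp(-x * x / w2 ** 2 - 0.5j * k2 * x * x * inv_R)
            * math.exp(-x * x / wf ** 2), half)
        norm_v = integrate(lambda x: math.exp(-2.0 * x * x / w2 ** 2), half)
        norm_f = integrate(lambda x: math.exp(-2.0 * x * x / wf ** 2), half)
        return abs(overlap) ** 2 / (norm_v * norm_f)
    return axis(w2x, inv_R2x) * axis(w2y, inv_R2y)

k2 = 2.0 * math.pi * 1.55 / 1.5        # wave number in the lens medium, 1/um
wf = 3.238                             # fiber spot size for V = 3.511, um
w2x, w2y, inv_R = 3.0, 3.4, 1.0 / 50.0  # hypothetical transformed beam

print("closed form:", eta_closed_form(w2x, w2y, wf, k2, inv_R, inv_R))
print("numerical  :", eta_numeric(w2x, w2y, wf, k2, inv_R, inv_R))
```

When the transformed beam is circular, matched to the fiber spot size, and has a planar wavefront, (7.14) reduces to unity, which is the spot-size matching condition referred to in the discussion of the results.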

The geometry used to evaluate the coupling efficiency in the presence of transverse mismatch is shown in Figure 7.2.

In this context, we presume that the center of the fiber is displaced to a point having coordinates  $(d_1, d_2)$  from its original position (0, 0) in the X-Y plane.

The relation between the primed and unprimed coordinates is presented below,

$$x = x' + d_1, \qquad y = y' + d_2$$
(7.15)




Figure 7.2: Transverse misalignment between the center of the fiber and imaged laser diode

In this case, the fundamental mode of the fiber can be expressed as [217],

$$\Psi_{f} = \exp\left[-\left\{\frac{\left(x-d_{1}\right)^{2}+\left(y-d_{2}\right)^{2}}{w_{f}^{2}}\right\}\right]$$
(7.16)

The coupling efficiency in presence of transverse mismatch is obtained using (7.5), (7.14) and (7.16) [107, 119],

$$\eta_{t} = \eta \exp\left[ \frac{2d_{1}^{2}}{w_{f}^{2}} \left\{ \frac{w_{2x}^{2} (w_{2x}^{2} + w_{f}^{2})}{(w_{f}^{2} + w_{2x}^{2})^{2} + (k_{2}^{2} w_{2x}^{4} w_{f}^{4})/4R_{2x}^{2}} - 1 \right\} \right] \times \exp\left[ \frac{2d_{2}^{2}}{w_{f}^{2}} \left\{ \frac{w_{2y}^{2} (w_{2y}^{2} + w_{f}^{2})}{(w_{f}^{2} + w_{2y}^{2})^{2} + (k_{2}^{2} w_{2y}^{4} w_{f}^{4})/4R_{2y}^{2}} - 1 \right\} \right]$$
(7.17)
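The transverse-offset factor can be verified along one axis by comparing it with a direct summation of the overlap integral for a displaced fiber mode. In the sketch below the denominator of the bracketed term is taken as $(w_f^2 + w_2^2)^2 + k_2^2 w_2^4 w_f^4 / 4R_2^2$, the same quantity that appears in (7.14). The function names and the beam parameters in the example are hypothetical. The short sweep at the end assumes, purely for illustration, a transformed beam matched to each fiber spot size with a planar wavefront.

```python
import cmath
import math

def transverse_factor(d, w2, wf, k2, inv_R2=0.0):
    """One-axis factor multiplying eta for an offset d along that axis."""
    denom = (wf ** 2 + w2 ** 2) ** 2 + (k2 * w2 ** 2 * wf ** 2 * inv_R2) ** 2 / 4.0
    bracket = w2 ** 2 * (w2 ** 2 + wf ** 2) / denom - 1.0
    return math.exp(2.0 * d * d / wf ** 2 * bracket)

def overlap_ratio_numeric(d, w2, wf, k2, inv_R2=0.0, n=6000):
    """|overlap(d)|^2 / |overlap(0)|^2 by direct summation along one axis."""
    half = 8.0 * max(w2, wf) + abs(d)
    h = 2.0 * half / n

    def overlap_sq(shift):
        total = 0j
        for i in range(n + 1):
            x = -half + i * h
            beam = cmath.exp(-x * x / w2 ** 2 - 0.5j * k2 * x * x * inv_R2)
            mode = math.exp(-(x - shift) ** 2 / wf ** 2)  # displaced fiber mode
            total += beam * mode
        return abs(total * h) ** 2

    return overlap_sq(d) / overlap_sq(0.0)

k2 = 2.0 * math.pi * 1.55 / 1.5          # lens-medium wave number, 1/um
print(transverse_factor(1.0, 3.0, 3.238, k2, 1.0 / 50.0),
      overlap_ratio_numeric(1.0, 3.0, 3.238, k2, 1.0 / 50.0))

# Illustrative sweep: matched beam (w2 = wf) and planar wavefront, assumed.
for wf in (2.676, 3.238, 9.901):
    row = [transverse_factor(d, wf, wf, k2) for d in (0.5, 1.0, 2.0)]
    print(f"w_f = {wf:.3f} um:", ", ".join(f"{v:.3f}" for v in row))
```

Under the matched-beam assumption the factor reduces to $\exp(-d^2/w_f^2)$, so a fiber with a larger spot size loses less efficiency for the same offset.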

Again, for the angular misalignment, we presume that an angular shift ($\theta$) occurs between the lens-transformed input beam and the end face of the fiber.

Figure 7.3 shows the angular misalignment $\theta$ between the input transformed by the upside down tapered hemispherical lens and the entrance face of the fiber.

For a very small angular mismatch ($\theta$), the lens-transformed laser field on the fiber can be approximated as [217],

$$\Psi_{v} = \exp\left[-\left(\frac{x^{\prime 2}}{w_{2x}^{2}} + \frac{y^{\prime 2}}{w_{2y}^{2}}\right)\right] \times \exp\left[-\frac{jk_{2}}{2}\left(\frac{x^{\prime 2}}{R_{2x}} + \frac{y^{\prime 2}}{R_{2y}}\right)\right] \times \exp(jk_{2}x^{\prime}\theta)$$
(7.18)



Figure 7.3: Angular misalignment $\theta$ between the upside down tapered hemispherical lens transformed input and the entrance face of the fiber

Then, the fundamental mode of the circular core fiber is expressed by,

$$\Psi_f = \exp\left[-\frac{x^{\prime 2} + y^{\prime 2}}{w_f^2}\right]$$
(7.19)

Utilizing (7.13), (7.18) and (7.19), we get the following expression for the coupling efficiency ( $\eta_a$ ) in the presence of a small angular misalignment $\theta$ [107, 119],

$$\eta_a = \eta \exp\left[-\frac{k_2^2 \theta^2}{2} \left\{ \frac{\left(w_f^2 + w_{2x}^2\right) w_{2x}^2 w_f^2}{\left(w_f^2 + w_{2x}^2\right)^2 + \left(k_2^2 w_{2x}^4 w_f^4\right) / 4R_{2x}^2} \right\} \right]$$
(7.20)

We have analyzed the coupling efficiencies by applying the analytical relations (7.17) and (7.20) respectively for the said type of coupling device.
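The angular factor in (7.20) can be checked in the same spirit by summing the overlap of the tilted field (7.18) with the fiber mode (7.19) along the x axis. The sketch below is an illustration under assumed beam parameters, and the function names are hypothetical. Note that $\theta$ must be supplied in radians, so a misalignment quoted in degrees has to be converted first.

```python
import cmath
import math

def angular_factor(theta_rad, w2x, wf, k2, inv_R2x=0.0):
    """Factor multiplying eta in (7.20) for a tilt theta (radians) about y."""
    denom = (wf ** 2 + w2x ** 2) ** 2 + (k2 * w2x ** 2 * wf ** 2 * inv_R2x) ** 2 / 4.0
    numer = (wf ** 2 + w2x ** 2) * w2x ** 2 * wf ** 2
    return math.exp(-0.5 * (k2 * theta_rad) ** 2 * numer / denom)

def tilt_ratio_numeric(theta_rad, w2x, wf, k2, inv_R2x=0.0, n=6000):
    """|overlap(theta)|^2 / |overlap(0)|^2 by direct summation along x."""
    half = 8.0 * max(w2x, wf)
    h = 2.0 * half / n

    def overlap_sq(theta):
        total = 0j
        for i in range(n + 1):
            x = -half + i * h
            beam = cmath.exp(-x * x / w2x ** 2 - 0.5j * k2 * x * x * inv_R2x
                             + 1j * k2 * x * theta)      # tilted field, (7.18)
            total += beam * math.exp(-x * x / wf ** 2)    # fiber mode, (7.19)
        return abs(total * h) ** 2

    return overlap_sq(theta_rad) / overlap_sq(0.0)

k2 = 2.0 * math.pi * 1.55 / 1.5            # lens-medium wave number, 1/um
theta = math.radians(1.0)                  # 1 degree, converted to radians
w2x, wf, inv_R = 3.0, 3.238, 1.0 / 50.0    # hypothetical transformed beam

print("closed form:", angular_factor(theta, w2x, wf, k2, inv_R))
print("numerical  :", tilt_ratio_numeric(theta, w2x, wf, k2, inv_R))
```

Only the x direction appears because the tilt term $\exp(jk_2x'\theta)$ in (7.18) depends on $x'$ alone.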

### 7.3 Results and Discussions:

In this section, we elaborate on and discuss the results obtained. Two different sources, one emitting at a wavelength of 1.5 $\mu$m with spot sizes $w_{1x}$ = 0.843 $\mu$m, $w_{1y}$ = 0.857 $\mu$m, and another emitting at 1.3 $\mu$m with spot sizes $w_{1x}$ = 1.081 $\mu$m, $w_{1y}$ = 1.161 $\mu$m, are used to examine the coupling efficiencies for this device in the presence of possible transverse and angular misalignments. Three triangular-index fibers with V numbers 4.380, 3.511 and 1.924, having spot sizes ($w_f$) of 2.676 $\mu$m, 3.238 $\mu$m, and 9.901 $\mu$m respectively [122, 155], are chosen for the purpose. In this regard, it is worth mentioning that the first higher order mode cut-off V value of a triangular index fiber is 4.380 [221-223], and the V values have been selected with this in mind. The refractive indices of the core and cladding are 1.46 and 1.45 respectively for each fiber, and the refractive index of the lens material is 1.55 with respect to air. Furthermore, a hemispherical lens with a radius of curvature ($R_0$) of 90 $\mu$m and a radius of aperture (D) of 2.5 $\mu$m is used. Moreover, designers restrict transverse and angular mismatches to within 2 $\mu$m and 2° respectively. Thus our study of coupling efficiencies for transverse and angular mismatches is carried up to 2 $\mu$m and 2° respectively. As the coupling efficiencies found on the basis of the spherical wavefront model differ only slightly from those calculated on the basis of the planar wavefront model [105, 107, 117, 119, 122, 130, 149, 153, 155], the planar wavefront model for the incident beam has been used. In each case, the distance of the laser diode from the lens is optimized for maximum coupling efficiency.

The variation of coupling efficiency in the presence of transverse offset for the three triangular index fibers with V values 4.380, 3.511 and 1.924 at the wavelength of 1.5 $\mu$m is shown in Figure 7.4. In this figure, the variation of coupling efficiency with respect to $d_1$ (with $d_2$ = 0 $\mu$m) is shown by solid lines, while the dotted lines show the variation of coupling efficiency versus $d_2$ (with $d_1$ = 0 $\mu$m). Similarly, the variation of coupling efficiency at the wavelength of 1.3 $\mu$m in the presence of transverse offset is shown in Figure 7.5 for the same fibers, again with solid lines for $d_1$ (with $d_2$ = 0 $\mu$m) and dotted lines for $d_2$ (with $d_1$ = 0 $\mu$m). The efficiency curves corresponding to $d_1$ and $d_2$ almost overlap each other, since the spot sizes along the X and Y directions differ only slightly at both wavelengths. The change in efficiency is found to be small when the misalignment is below 1 $\mu$m at both wavelengths. It is also seen that the effect of transverse offset is smallest for the triangular index fiber with V value 1.924 at both wavelengths.

The variation of coupling efficiency with angular misalignment is presented in Figures 7.6(a) and 7.6(b) at the wavelengths of 1.5 $\mu$m and 1.3 $\mu$m respectively for the same three fibers. Comparing the tolerance with respect to angular mismatch with that for transverse mismatch, the former is found to be better than the latter. It is clear from Figures 7.4 and 7.5 that the tolerance to transverse mismatch is good for the fiber with V number 1.924 at both wavelengths used, but it drops for the fibers with V values 3.511 and 4.380, the drop being largest at the wavelength of 1.3 $\mu$m. Again, it can be concluded from Figures 7.6(a) and 7.6(b) that all the fibers show good tolerance with respect to angular mismatch at both wavelengths.

It is also relevant to mention in this context that maximum efficiency demands matching of the fiber spot size with the lens-transformed laser spot size. From this point of view, it can easily be understood why the fiber with V number 1.924 gives a maximum efficiency of 97.31% at the wavelength of 1.5 $\mu$m in the absence of mismatch, while the fiber with V number 3.511 gives a maximum efficiency of around 80.25% at the wavelength of 1.3 $\mu$m in the absence of mismatch. Thus, from the present study, it is possible to make a judicious selection of the fiber at the operating wavelength, considering both its coupling efficiency and its tolerance with respect to possible mismatches. Accordingly, designers and packagers dealing with such an optical coupler will benefit from the results.

ABCD matrix formalism is also being used successfully for the prediction of coupling optics in photonic crystal fibers [124, 224, 225]. A generalized ABCD matrix formalism for the study of laser resonators and beam propagation has also been added to the literature [226]. The application of ABCD matrix formalism to the prediction of coupling optics involving the upside down tapered hemispherical microlens [123, 129] as well as the upside down tapered hyperbolic microlens [227] has enriched the literature considerably. Moreover, the present analysis provides sufficient scope for extension to the study of such couplers involving fibers of low V number. Such analysis is very important in devices like directional couplers and switches, which require evanescent coupling.





Figure 7.4: Coupling efficiency versus $d_1$ or $d_2$ for hemispherical microlens on upside down tapered triangular index fibers (V=4.380, 3.511 and 1.924) at emitting wavelength of 1.5 $\mu$m



Figure 7.5: Coupling efficiency versus $d_1$ or $d_2$ for hemispherical microlens on upside down tapered triangular index fibers (V=4.380, 3.511 and 1.924) at emitting wavelength of 1.3 $\mu$m



Figure 7.6: (a) Coupling efficiency versus $\theta$ for hemispherical microlens on upside down tapered triangular index fibers (V=4.380, 3.511 and 1.924) at emitting wavelength 1.5 $\mu$m, (b) Coupling efficiency versus $\theta$ for hemispherical microlens on upside down tapered triangular index fibers (V=4.380, 3.511 and 1.924) at emitting wavelength 1.3 $\mu$m

### 7.4 Chapter Summary:

Based on ABCD matrix formalism and taking into consideration possible transverse and angular offsets, we present the launch optics of laser diode to single-mode circular core triangular index fiber coupling via an upside-down tapered hemispherical microlens on the tip of the fiber. In this regard, triangular index fibers of three typical V numbers and two excitation wavelengths, namely 1.3 $\mu$m and 1.5 $\mu$m, have been used. It has been found that the fiber having V number 1.924 shows large tolerance with respect to both mismatches and also gives very high efficiency at the wavelength of 1.5 $\mu$m. The concerned implementation involves little computation. The results present the sensitivity of such a coupler to the said kinds of mismatches and will enable designers and packagers to select judiciously the most effective fiber in terms of maximum coupling efficiency.


ISBN: 978-93-90847-75-4

## Chapter 8

## **Conclusion and Scope of Future Work**

### 8.1 Conclusion:

Digital data hiding is a well-known method of securing digital multimedia content by means of tamper estimation, authentication, and copyright protection. The data hiding process embeds extra information in the visual content of a digital media object at the cost of some loss in visual quality. This property can be exploited for quality access control of digital images.

In view of the increasing demand for multimedia security in the age of the internet and the WWW, state-of-the-art technologies for quality access control of digital images are presented, and efficient hardware architectures for different access control schemes are realized and tested. The VLSI hardware is implemented in FPGA. The performance of the realized systems in terms of resource utilization, throughput and power consumption is demonstrated and discussed. Moreover, all the implemented hardware is optimized with respect to the area-power-speed trade-off used in VLSI designs. The implemented techniques are found to be superior to other contemporary results of similar research, and the designs show very low power consumption, high throughput, and efficient resource utilization.

In Chapter 3, a novel hardware architecture of a passive data-hiding scheme for quality access control of digital images in the DCT compressed domain is presented. The main design goals of low power usage, reliability, real-time performance, and ease of integration with existing consumer electronic devices are achieved by an efficient hardware architecture design. High throughput is achieved by a parallel hardware architecture. The performance of the architecture was studied by implementing it on a Xilinx FPGA (Zynq device family, XC7Z010-CLG400).

In Chapter 4, a dither modulation based data hiding technique for quality access control of digital images in the DCT domain is implemented in reconfigurable field programmable devices. The design addresses the increasing demand for real-time hardware implementations that can be fitted into existing consumer electronic devices.

The VLSI architecture is optimized by parallel processing and is implemented in FPGA to achieve several goals, such as real-time processing with greater reliability while operating at high speed and consuming very low power. The results are beneficial for portable consumer electronic gadgets because of the low power consumption. The parallel hardware implementation allows the system to achieve a very high throughput of 1.34 Gigabyte/s for the quality access control encoder and decoder at a maximum operating frequency of 131.16 MHz. The hardware system was implemented and tested on a Xilinx Virtex 7, XC7VX330T-FFG1157-3.

In Chapter 5, a hardware implementation of a data-hiding technique is proposed for efficient quality access control of images using lifting based DWT. A watermarking chip is necessary for low power, real-time performance, high reliability, low-cost applications, and easy integration with existing consumer electronic devices. With this in mind, a field programmable gate array (FPGA) based hardware architecture for real-time access control operation is synthesized. The effectiveness of the hardware is reflected in the simulation results. The key factors that make the design superior are (a) a very low power consumption of 78.48 mW at an operating frequency of 130.14 MHz and (b) a throughput of 23.82 Megabyte/s. Furthermore, the FPGA resource utilization is very low, so the design can be fitted into a resource-constrained field programmable device. The results for the access control system hardware were obtained with the implementation on a XILINX Zynq (XC7Z020-CLG484-1) FPGA.

The work presented in Chapters 3, 4, and 5 illustrates the implementation of image quality access control systems (encoder and decoder). The algorithms implemented there all use transform domain techniques. Although transform domain techniques are more robust, they involve a large amount of computation. From the perspective of VLSI hardware design, they become expensive in terms of both hardware execution time and implementation time, which increases the implementation cost. In view of this, we have developed a spatial domain technique for quality access control of images. The spatial domain technique is computationally simple and easy to implement in VLSI hardware.

In Chapter 6, we have developed a novel QIM data hiding scheme. The algorithm was developed mainly for quality access control of digital images. The performance of the encoding and decoding algorithm was tested and evaluated using MATLAB 14.2. The results show that the scheme is robust against a wide range of common image processing operations and attacks, especially image cropping. The encoding and decoding scheme is simple, and its reduced computational complexity makes it an attractive choice for hardware implementation. The proposed quality access control algorithm is implemented in FPGA, and the results are compared with related schemes and found to be superior in terms of throughput and power consumption. A high average throughput of 119.294 Megabyte/s is achieved for the access control system. The architectural implementation follows several low power design strategies, such as selective activation of circuit blocks and a pipelined architecture, to minimize power consumption.

Microwave photonics and information optics are emerging areas of research in the context of the increasing demand for high bandwidth. To achieve ultrafast speeds, optoelectronic devices are being replaced by photonic devices, and the march towards all-optical technology has generated tremendous contemporary interest. Accordingly, the study of laser diode to fiber coupling is of immense importance in the domain of integrated optics. In Chapter 7, the relevant ABCD matrix for an upside-down tapered hemispherical microlens on the tip of a triangular index fiber has been employed to study the coupling efficiency of a laser diode to a single-mode circular core triangular index fiber in the presence of possible transverse and angular offsets. The results obtained will prove beneficial to designers and packagers working in the field of optimum launch optics. In this context, triangular index fibers of three typical V numbers, excited at two wavelengths, namely 1.3 $\mu$m and 1.5 $\mu$m, have been studied. It is found that the fiber having V number 1.924 shows large tolerance with respect to both mismatches and also shows very high coupling efficiency at the wavelength of 1.5 $\mu$m. The execution of the ABCD matrix formalism involves little computation. The results present the sensitivity of such a coupler to the said kinds of misalignment and can guide designers and packagers in the judicious selection of the most effective fiber in terms of maximum coupling efficiency.

Further, a triangular-index fiber enjoys the merit of being the simplest kind of dispersion-shifted fiber and also of having low micro- and macro-bending loss. The main disadvantage of this fiber is the low value of its first higher order mode cut-off wavelength, which ranges between 0.85 $\mu$m and 0.90 $\mu$m. The operating wavelength has to be chosen far away from the cut-off wavelength for such types of fibers. This leads to the choice of two operating wavelengths, 1.3 $\mu$m and 1.5 $\mu$m.

It is relevant to mention that the present investigation can motivate the application of the prescribed formalism in different domains of optical technology. In fact, the literature on coupling optics is being continuously enriched by reports of investigations on various kinds of microlenses from different laboratories. The present formalism of coupling optics leaves ample scope for extension to the study of such couplers comprising photonic crystal fibers as well as multi-core fibers. In this respect, it is desirable to keep track of the relevant publications from places where the necessary facilities for such investigations are available. This will lead to the formulation of simple but accurate models for the prediction of coupling optics for fibers of different refractive index profiles as well as lenses of novel design.

## 8.2 The Scope of Future Work:

The present book describes the FPGA based hardware implementation of a few data hiding algorithms for quality access control of grayscale images, with the aim of achieving high throughput and low power consumption at the cost of little resource utilization. FPGA based reconfigurable hardware design of access control algorithms spans a wide variety of approaches and applications, from simple, low computational cost spatial domain implementations of the QIM watermarking technique to computationally complex transform domain (DCT and DWT) approaches, and from fragile to robust high payload QIM watermarking methods using DCT and lifting based wavelets. Due to the limited scope of the book, we have not been able to address all the issues related to the algorithms and their hardware architectures to an equal extent.

In this book, we have restricted ourselves to the design of reconfigurable hardware for various data hiding algorithms for grayscale images. In the modern age, a huge amount of color HD (high definition) image, video, and music content is widely disseminated over networks. Real-time hardware-based access control for this kind of archive is a significant issue. A few algorithms are considered for FPGA based hardware implementation, as presented in this book.

Those algorithms may be modified to meet the needs of color image, video, and music archives so that cost-effective hardware can be made. At the same time, developing real-time consumer hardware faces various design challenges in minimizing power consumption and resource utilization while maintaining high throughput. Intensive research is necessary to solve these issues.

Furthermore, power consumption is a major issue in real-time hardware implementation, and VLSI design engineers have a wide research area in which to mitigate it. One possible measure is an ASIC implementation, which may be the next research objective. Further, the optical coupling of a laser diode with a fiber should have large coupling efficiency together with good tolerance with respect to practical misalignment.

The use of ABCD matrix has simplified prediction of coupling optics. The concerned prediction also possesses sufficient accuracy. Thus this accurate technique will be user-friendly. In fact, this has motivated us to apply ABCD matrix formalism for prediction of coupling optics involving laser diode to single-mode circular core fiber coupling via upside down tapered hemispherical microlens on the fiber tip.

Side by side, different kinds of fibers as well as different kinds of microlenses are being reported continuously from laboratories where needful facilities are available. Thus one should be motivated to formulate an ABCD matrix for a particular kind of optical coupler and employ it to predict the associated coupling optics.

In conclusion, there is ample scope for formulating ABCD matrices for different types of microlenses and tapered lenses fabricated on different kinds of fibers. The formulated ABCD matrix can be employed to investigate the coupling optics of the coupler concerned. The accuracy of the formalism can be verified by comparing the results with available experimental results or with results obtained rigorously by applying the phase matching technique. It is relevant to mention in this connection that the ABCD matrix formalism can similarly take care of the inherent noncircularity of the fiber core.

ISBN: 978-93-90847-75-4

## References

- 1. Raphaël Grosbois, Pierre Gerbelot, and Touradj Ebrahimi, "Authentication and access control in the JPEG 2000 compressed domain", Applications of Digital Image Processing, 4472, 95-105, 2001.
- 2. Elisa Bertino, Barbara Catania, Elena Ferrari, and Paolo Perlasca, "A logical framework for reasoning about access control models", ACM Transactions on Information and System Security, 6, 71-127, 2003.
- 3. Masaaki Fujiyoshi, Shoko Imaizumi, and Hitoshi Kiya, "Encryption of Composite Multimedia Contents for Access Control", IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, E90-A, 590-596, 2007.
- 4. Jiang-Lung Liu, "Efficient selective encryption for JPEG 2000 images using private initial table", Pattern Recognition, 39, 1509-1517, 2006.
- 5. Lino Coria, Panos Nasiopoulos, Rabab Ward, and Mark Pickering, "An access control video watermarking method that is robust to geometric distortions", Journal of Information Assurance and Security, 2, 266-274, 2007.
- 6. Feng-Cheng Chang, Hsiang-Cheh Huang, and Hsueh-Ming Hang, "Layered access control schemes on watermarked scalable media", The Journal of VLSI Signal Processing Systems for Signal, Image, and Video Technology, 49, 443-455, 2007.
- 7. Andrew Kingston and Florent Autrusseau, "Lossless image compression via predictive coding of discrete Radon projections", Signal Processing: Image Communication, 23, 313-324, 2008.
- 8. Lino E. Coria, Mark R. Pickering, Panos Nasiopoulos and Rabab Kreidieh Ward, "A Video Watermarking Scheme Based on the Dual-Tree Complex Wavelet Transform", IEEE transactions on information forensics and security, 3, 466-474, 2008.
- 9. Giakoumaki Aggeliki, Sotiris Pavlopoulos, and Dimitris Koutsouris, "Secure and efficient health data management through multiple watermarking on medical images", Medical and Biological Engineering and Computing, 44, 619-631, 2006.
- 10. Anurag Mishra, Aruna Jain, Manish Narwaria, and Charu Agarwal, "An experimental study into objective quality assessment of watermarked images", International Journal of Image Processing, 5, 199-219, 2011.
- 11. Amit Phadikar, Santi P Maity, and Malay K Kundu, "Quantization based data hiding scheme for efficient quality access control of images using DWT via lifting", Computer Vision, Graphics & Image Processing, 1, 265-272, 2008.
- 12. Amit Phadikar and Santi P Maity, "ROI based quality access control of compressed color image using DWT via lifting", Electronic Letters on Computer Vision and Image Analysis, 8, 51-67, 2009.
- 13. Amit Phadikar and Santi P Maity, "Quality access control of compressed color images using data hiding", AEU-International Journal of Electronics and Communications, 64, 833-843, 2010.
- 14. Amit Phadikar and Santi P Maity, "Data hiding based quality access control of digital images using adaptive QIM and lifting", Signal Processing: Image Communication, 26, 646-661, 2011.
- 15. Nikos Passas, Sarantis Paskalis, Dimitra Vali, and Lazaros Merakos, "Quality-of-service oriented medium access control for wireless ATM networks", IEEE Communications Magazine, 35, 42-50, 1997.

- 16. Louis Coetzee and Elizabeth C Botha, "Fingerprint recognition in low quality images", Pattern recognition, 26, 1441-1460, 1993.
- 17. John S Hendricks, John S McCoskey, and Michael Asmussen, "Apparatus for video access and control over computer network, including image correction", U.S. Patent, 6,675,386, 2004.
- 18. Jeng-Shyang Pan, Hsiang-Cheh Huang, and Lakhmi C Jain, "Intelligent watermarking techniques", 7, World scientific, 2004.
- 19. Stefan Katzenbeisser and Fabien Petitcolas, "Information hiding techniques for steganography and digital watermarking", Artech house, 2000.
- 20. Michael Konrad Arnold, Martin Schmucker, and Stephen D Wolthusen, "Techniques and applications of digital watermarking and content protection", Artech House, 2003.
- 21. Amit Phadikar, Santi P Maity, and Mrinal Mandal, "Novel wavelet-based QIM data hiding technique for tamper detection and correction of digital images", Journal of Visual Communication and Image Representation, 23, 454-466, 2012.
- 22. Amit Phadikar, "Multibit quantization index modulation: A high-rate robust data-hiding method", Journal of King Saud University-Computer and Information Sciences, 25, 163-171, 2013.
- 23. Sushmita Ruj, Milos Stojmenovic, and Amiya Nayak, "Decentralized access control with anonymous authentication of data stored in clouds", IEEE transactions on parallel and distributed systems, 25, 384-394, 2014.
- 24. Zheng Yan, Xueyun Li, Mingjun Wang, and Athanasios V Vasilakos, "Flexible data access control based on trust and reputation in cloud computing", IEEE Transactions on Cloud Computing, 5, 485-498, 2017.
- 25. Frank Wang, James Mickens, Nickolai Zeldovich, and Vinod Vaikuntanathan, "Sieve: Cryptographically Enforced Access Control for User Data in Untrusted Clouds", Networked Systems Design and Implementation, 16, 611-626, 2016.
- 26. Minhaj Ahmad Khan, "A survey of security issues for cloud computing", Journal of Network and Computer Applications, 71, 11-29, 2016.
- 27. Bassam J. Mohd, Thaier Hayajneh, Zaid Abu Khalaf, and Athanasios V. Vasilakos, "A comparative study of steganography designs based on multiple FPGA platforms", International Journal of Electronic Security and Digital Forensics, 8, 164-190, 2016.
- 28. Sundararaman Rajagopalan, Pakalapati JS Prabhakar, Mucherla Sudheer Kumar, NVM Nikhil, Har Narayan Upadhyay, JBB Rayappan, and Rengarajan Amirtharajan, "MSB based embedding with integrity: An adaptive RGB Stego on FPGA platform", Information Technology Journal, 13, 1945-1952, 2014.
- 29. B. Zaidan and A. A. Zaidan, "Software and hardware FPGA-based digital watermarking and steganography approaches: Toward new methodology for evaluation and benchmarking using multi-criteria decision-making techniques", Journal of Circuits, Systems and Computers, 26, 1750116, 2017.
- 30. S Raveendra Reddy and SM Sakthivel, "A FPGA implementation of data hiding using LSB matching method", International Journal of Research in Engineering & Technology, 4, 194-198, 2015.
- 31. Ingemar J Cox and Matt L Miller, "Review of watermarking and the importance of perceptual modeling", Human Vision and Electronic Imaging II, 3016, 92-100, 1997.
- 32. Frank Y. Shih, "Digital watermarking and steganography: fundamentals and techniques", CRC Press, 2017. (ISBN: 9781498738767).
- 33. Chris Honsinger, "Digital watermarking", Journal of Electronic Imaging, 11, 414, 2002.

- 34. Juergen Seitz, "Digital watermarking for digital media", IGI Global, 2005. (ISBN10: 1591405181).
- 35. Artz Donovan, "Digital steganography: hiding data within data", IEEE Internet computing, 5, 75-80, 2001.
- 36. Joceli Mayer, Paulo VK Borges, and Steven J Simske, "Fundamentals and Applications of Hardcopy Communication", Springer, 2018. (ISBN: 978-3-319-74082-9)
- 37. L Robert and T Shanmugapriya, "A study on digital watermarking techniques", International journal of Recent trends in Engineering, 1, 223-225, 2009.
- 38. Gary C. Kessler and Chet Hosmer, "An overview of steganography", Advances in Computers, 83, 51-107, 2011.
- 39. Ahmet M Eskicioglu and Edward J Delp, "An overview of multimedia content protection in consumer electronics devices", Signal Processing: Image Communication, 16, 681-699, 2001.
- 40. Christine I Podilchuk and Edward J Delp, "Digital watermarking: algorithms and applications", IEEE signal processing Magazine, 18, 33-46, 2001.
- 41. Ton Kalker, Geert Depovere, Jaap Haitsma, and Maurice JJJB Maes, "Video watermarking system for broadcast monitoring", Security and Watermarking of Multimedia contents, 3657, 103-113, 1999.
- 42. Lian Shiguo, Handbook of research on secure multimedia distribution. IGI Global, 2009. (ISBN: 978-1-60566-262-6).
- 43. Xiaojun Qi and Xing Xin, "A singular-value-based semi-fragile watermarking scheme for image content authentication with tamper localization", Journal of Visual Communication and Image Representation, 30, 312-327, 2015.
- 44. Nassiri Boujemaa, EL Yousef, Latif Rachid, and Bsiss Mohammed Aziz, "Fragile watermarking of medical image for content authentication and security", International Journal of Computer Science and Network, 5, 734-740, 2016.
- 45. Rayachoti Eswaraiah and Edara Sreenivasa Reddy, "Robust medical image watermarking technique for accurate detection of tampers inside region of interest and recovering original region of interest", IET image Processing, 9, 615-625, 2015.
- 46. Amit Kumar Singh, "Improved hybrid algorithm for robust and imperceptible multiple watermarking using digital images", Multimedia Tools and Applications, 76, 8881-8900, 2017.
- 47. Mohammad Ali Nematollahi, Chalee Vorakulpipat, and Hamurabi Gamboa Rosales, Preliminary on Watermarking Technology, in Digital Watermarking, Springer, 2017. (ISBN: 978-981-10-2094-0)
- 48. A. S. Kapse, Sharayu Belokar, Yogita Gorde, Radha Rane, and Shrutika Yewtkar, "Digital Image Security Using Digital Watermarking", International Research Journal of Engineering and Technology, 5, 163-166, 2018.
- 49. Asaad F Qasim, Farid Meziane, and Rob Aspin, "Digital watermarking: applicability for developing trust in medical imaging workflows state of the art review", Computer Science Review, 27, 45-60, 2018.
- 50. L. De Strycker, P. Termont, J. Vandewege, J. Haitsma, A. Kalker, M. Maes, and G. Depovere, "Implementation of a real-time digital watermarking process for broadcast monitoring on a TriMedia VLIW processor", IEE Proceedings-Vision, Image and Signal Processing, 147, 371-376, 2000.
- 51. Nebu John Mathai, Deepa Kundur, and Ali Sheikholeslami, "Hardware implementation perspectives of digital video watermarking algorithms", IEEE Transactions on Signal Processing, 51, 925-938, 2003.

- 52. Fernando Pérez-González and Mosquera Carlos, "Quantization-based data hiding robust to linear-time-invariant filtering", IEEE Transactions on Information Forensics and Security, 3, 137-152, 2008.
- 53. Asifullah Khan and Sana Ambreen Malik, "A high capacity reversible watermarking approach for authenticating images: exploiting down-sampling, histogram processing, and block selection", Information Sciences, 256, 162-183, 2014.
- 54. Jian-Ruei Chen, Paul C-P Chao, Che-Hung Tsai, and Wei-Dar Chen, "Design and realization of a high resolution (640× 480) SWIR image acquisition system", Microsystem Technologies, 20, 1583-1595, 2014.
- 55. Wei Sun, Zhe-Ming Lu, Yu-Chun Wen, Fa-Xin Yu, and Rong-Jun Shen, "High performance reversible data hiding for block truncation coding compressed images", Signal, Image and Video Processing, 7, 297-306, 2013.
- 56. Cheonshik Kim, Dongkyoo Shin, Lu Leng, and Ching-Nung Yang, "Lossless data hiding for absolute moment block truncation coding using histogram modification", Journal of Real-Time Image Processing, 14, 101-114, 2018.
- 57. Ramesh Karri, Kaijie Wu, Piyush Mishra, and Yongkook Kim, "Concurrent error detection schemes for fault-based side-channel cryptanalysis of symmetric block ciphers", IEEE Transactions on Computer-aided Design of Integrated Circuits and Systems, 21, 1509-1517, 2002.
- 58. Shaowei Weng, JS Pan, and X Gao, "Reversible watermark combining pre-processing operation and histogram shifting", Journal of Information Hiding and Multimedia Signal Processing, 3, 320-326, 2012.
- 59. Shabir A Parah, Javaid A Sheikh, Nazir A Loan, and Ghulam M Bhat, "Robust and blind watermarking technique in DCT domain using inter-block coefficient differencing", Digital Signal Processing, 53, 11-24, 2016.
- 60. Chun-Chi Lo, Yu-Chen Hu, Wu-Lin Chen, and Chang-Ming Wu, "Reversible data hiding scheme for BTC-compressed images based on histogram shifting", International Journal of Security and its Applications, 8, 301-314, 2014.
- 61. Vaibhav B Joshi, Mehul S Raval, Dhruv Gupta, Priti P Rege, and S K Parulkar, "A multiple reversible watermarking technique for fingerprint authentication", Multimedia Systems, 22, 367-378, 2016.
- 62. Younes Terchi and Saad Bouguezel, "A blind audio watermarking technique based on a parametric quantization index modulation", Multimedia Tools and Applications, 1-28, 2018.
- 63. Sengul Dogan, "A reversible data hiding scheme based on graph neighbourhood degree", Journal of Experimental & Theoretical Artificial Intelligence, 29, 741-753, 2017.
- 64. Amit Phadikar, Himadri Mandal, Goutam Kr Maity, and Tien-Lung Chiu, "A new model of QIM data hiding for quality access control of digital image", International Conference on Soft-Computing and Networks Security (ICSNS), 1-5, 2015.
- 65. Mustafa Osman Ali, Elamir Abu Abaida Ali Osman, and Rameshwar Row, "Invisible digital image watermarking in spatial domain with random localization", International Journal of Engineering and Innovative Technology, 2, 227-231, 2012.
- 66. Yanyan Xu, Lizhi Xiong, Zhengquan Xu and Shaoming Pan, "A content security protection scheme in JPEG compressed domain", Journal of Visual Communication and Image Representation, 25, 805-813, 2014.
- 67. P. Deepa and C. Vasanthanayaki, "Image coding using lapped biorthogonal transform", Signal, Image and Video Processing, 7, 879-888, 2013.

- 68. M. Ferretti and D. Rizzo, "A Parallel Architecture for the 2-D Discrete Wavelet Transform with Integer Lifting Scheme", Journal of VLSI signal processing systems for signal, image and video technology, 28, 165-185, 2001.
- 69. Saraju P Mohanty, Nagarajan Ranganathan, and Karthikeyan Balakrishnan, "A dual voltage-frequency VLSI chip for image watermarking in DCT domain", IEEE Transactions on Circuits and Systems II: Express Briefs, 53, 394-398, 2006.
- 70. Jiliang Zhang and Lele Liu, "Publicly verifiable watermarking for intellectual property protection in FPGA design", IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 25, 1520-1527, 2017.
- 71. Santi P Maity, and Malay K Kundu, "Distortion free image-in-image communication with implementation in FPGA", AEU-International Journal of Electronics and Communications, 67, 438-447, 2013.
- 72. Med Lassaad Kaddachi, Adel Soudani, Vincent Lecuire, Kholdoun Torki, Leila Makkaoui, and Jean-Marie Moureaux, "Low power hardware-based image compression solution for wireless camera sensor networks", Computer Standards & Interfaces, 34, 14-23, 2012.
- 73. Saraju P Mohanty, Elias Kougianos, and Nagarajan Ranganathan, "VLSI architecture and chip for combined invisible robust and fragile watermarking", IET Computers & Digital Techniques, 1, 600-611, 2007.
- 74. P Karthigaikumar and K Baskaran, "FPGA and ASIC implementation of robust invisible binary image watermarking algorithm using connectivity preserving criteria", Microelectronics Journal, 42, 82-88, 2011.
- 75. Santi P Maity, Malay K Kundu, and Seba Maity, "Dual purpose FWT domain spread spectrum image watermarking in real time", Computers & Electrical Engineering, 35, 415-433, 2009.
- 76. Hirak Kumar Maity and Santi P Maity, "FPGA implementation of reversible watermarking in digital images using reversible contrast mapping", Journal of Systems and Software, 96, 93-104, 2014.
- 77. Hirak Kumar Maity and Santi P Maity, "FPGA Implementation for Modified RCM-RW on Digital Images", Journal of Circuits, Systems and Computers, 26, 1750044, 2017.
- 78. Saraju P Mohanty and Elias Kougianos, "Real-time perceptual watermarking architectures for video broadcasting", Journal of Systems and Software, 84, 724-738, 2011.
- 79. Anand D Darji, TC Lad, Shabbir N Merchant, and Arun N Chandorkar, "Watermarking hardware based on wavelet coefficients quantization method", Circuits, Systems, and Signal Processing, 32, 2559-2579, 2013.
- 80. B. Krill, A. Ahmad, A. Amira, and H. Rabah, "An efficient FPGA-based dynamic partial reconfiguration design flow and environment for image and signal processing IP cores", Signal Processing: Image Communication, 25, 377-387, 2010.
- 81. Subhajit Das, Reshmi Maity, and N P Maity, "VLSI-Based Pipeline Architecture for Reversible Image Watermarking by Difference Expansion with High-Level Synthesis Approach", Circuits, Systems, and Signal Processing, 37, 1575-1593, 2018.
- 82. Sambaran Hazra, Sudip Ghosh, Sayandip De, and Hafizur Rahaman, "FPGA implementation of semi-fragile reversible watermarking by histogram bin shifting in real time", Journal of Real-Time Image Processing, 14, 193-221, 2018.

- 83. Sonjoy Deb Roy, Xin Li, Yonatan Shoshan, Alexander Fish, and Orly Yadid-Pecht, "Hardware implementation of a digital watermarking system for video authentication", IEEE Transactions on Circuits and Systems for Video Technology, 23, 289-301, 2013.
- 84. B Zaidan, and A Zaidan, "Software and hardware FPGA-based digital watermarking and steganography approaches: Toward new methodology for evaluation and benchmarking using multi-criteria decision-making techniques", Journal of Circuits, Systems and Computers, 26, 1750116, 2017.
- 85. Zhongxun Wang and Tiantian Tang, "Design of LDPC Decoder Based On FPGA in Digital Image Watermarking Technology", Telkomnika (Telecommunication Computing Electronics and Control), 15, 457-463, 2017.
- 86. Jacques M Bahi, Xiaole Fang, Christophe Guyeux, and Laurent Larger, "FPGA Design for Pseudorandom Number Generator Based on Chaotic Iteration used in Information Hiding Application", Applied Mathematics & Information Sciences, 7, 2175-2188, 2013.
- 87. Ziyu Yang and Xiaodong Hu, "Improved digital watermarking method and realization based on FPGA", Optoelectronic Detection Technology and Application, 10697, 106971Q-1, 2018.
- 88. Manas Ranjan Nayak, Joyashree Bag, Souvik Sarkar, and Subir Kumar Sarkar, "Hardware implementation of a novel watermarking algorithm based on phase congruency and singular value decomposition technique", AEU-International Journal of Electronics and Communications, 71, 1-8, 2017.
- 89. Nisreen I. R. Yassin, "Digital watermarking for telemedicine applications: a review", International Journal of Computer Applications, 129, 30-37, 2015.
- 90. Lu-Ting Ko, Jwu-E Chen, Yaw-Shih Shieh, Massimo Scalia, and Tze-Yun Sung, "A novel fractional-discrete-cosine-transform-based reversible watermarking for healthcare information management systems", Mathematical Problems in Engineering, 2012, 1-17, 2012.
- 91. Shih-Ching Ou, Hung-Yuan Chung, and Wen-Tsai Sung, "Improving the compression and encryption of images using FPGA-based cryptosystems", Multimedia Tools and Applications, 28, 5-22, 2006.
- 92. S. M. Sakthivel and A. Ravi Sankar, "An ASIC based invisible watermarking of grayscale images using pixel value search algorithm (PVSA)", Multimedia Tools and Applications, 77, 26793-26819, 2018.
- 93. Andrzej Głowacz and Marcin Pietroń, "Implementation of Digital Watermarking Algorithms in Parallel Hardware Accelerators", International Journal of Parallel Programming, 45, 1108-1127, 2017.
- 94. D Naresh Kumar, V Arun, T Ravinder, and T Nagarjuna, "A Novel Approach for Image Integrity Protection on FPGA", Journal of Advanced Research in Dynamical and Control Systems, 1238-1241, 2018.
- 95. Pierre Greisen, Simon Heinzle, Markus Gross and Andreas P Burg, "An FPGA-based processing pipeline for high-definition stereo video", EURASIP Journal on Image and Video Processing, 1, 1-13, 2011.
- 96. Robert Ladig, Suphachart Leewiwatwong and Kazuhiro Shimonomura, "FPGA-Based Fast Response Image Analysis for Orientational Control in Aerial Manipulation Tasks", Journal of Signal Processing Systems, 90, 901-911, 2017.
- 97. Altaf O Mulani, and P B Mane, "Watermarking and Cryptography Based Image Authentication on Reconfigurable Platform", Bulletin of Electrical Engineering and Informatics, 6, 181-187, 2017.

- 98. Liang Ye, "FPGA and DSP Based Digital Watermark Detector is Designed", Automation & Instrumentation, 6, 015, 2016.
- 99. E G Neumann, "Single-Mode Fiber Fundamentals", 5, Springer, 1988. (ISBN: 978-3-540-48173-7).
- 100. L A Wang and C D Su, "Tolerance analysis of aligning an astigmatic laser diode with a single-mode optical fiber", Journal of Lightwave Technology, 14, 2757-2762, 1996.
- 101. H Ghafoori-Shiraz and T Asano, "Microlens for coupling a semiconductor laser to a single-mode fiber", Optics Letters, 11, 537-539, 1986.
- 102. H M Presby and C. A. Edwards, "Near 100% efficient Fibre microlenses", Electronics Letters, 28, 582-584, 1992.
- 103. H M Presby and C. A. Edwards, "Efficient coupling of polarization-maintaining fiber to laser diodes", IEEE Photonics Technology Letters, 4, 897-899, 1992.
- 104. J John, T S M Maclean, N Ghafouri-Shiraz, and John Niblett, "Matching of single-mode fibre to laser diode by microlenses at 1.5 μm wavelength", IEE Proceedings-Optoelectronics, 141, 178-184, 1994.
- 105. Sankar Gangopadhyay and Somenath Sarkar, "ABCD matrix for reflection and refraction of Gaussian light beams at surfaces of hyperboloid of revolution and efficiency computation for laser diode to single-mode fiber coupling by way of a hyperbolic lens on the fiber tip", Applied Optics, 36, 8582-8586, 1997.
- 106. S Gangopadhyay and S N Sarkar, "Laser diode to single-mode fiber excitation via hemispherical lens on the fiber tip: efficiency computation by ABCD matrix with consideration for allowable aperture", Journal of Optical Communications, 19, 42-44, 1998.
- 107. S Gangopadhyay and S N Sarkar, "Misalignment considerations in laser diode to single-mode fiber excitation via hemispherical lens on the fiber tip", Journal of Optical Communications, 19, 217-221, 1998.
- 108. S Gangopadhyay and S N Sarkar, "Misalignment considerations in laser diode to single-mode fibre excitation via hyperbolic lens on the fibre tip", Optics Communications, 146, 104-108, 1998.
- 109. Samir Kumar Mondal and Somenath Sarkar, "Coupling of a laser diode to single-mode fiber with an upside-down tapered lens end", Applied Optics, 38, 6272-6277, 1999.
- 110. Faidz Abd Rahman, Kenzo Takahashi, and Chuah Hean Teik, "A scheme to improve the coupling efficiency and working distance between laser diode and single mode fiber", Optics Communications, 208, 103-110, 2002.
- 111. Liu Hongzhan, Liu Liren, Xu Rongwei, and Luan Zhu, "Simple ABCD matrix method for evaluating optical coupling system of laser diode to single-mode fiber with a lensed-tip", Optik-International Journal for Light and Electron Optics, 116, 415-418, 2005.
- 112. Kumaran Sambanthan and Faidz Abdul Rahman, "Method to improve the coupling efficiency of a hemispherically lensed asymmetric tapered-core fiber", Optics Communications, 254, 112-118, 2005.
- 113. MC Kundu and S Gangopadhyay, "Laser diode to monomode elliptic core fiber excitation via hemispherical lens on the fiber tip: efficiency computation by ABCD matrix with consideration for allowable aperture", Optik-International Journal for Light and Electron Optics, 117, 586-590, 2006.
- 114. Sumanta Mukhopadhyay, Sankar Gangopadhyay, and Somenath N Sarkar, "Coupling of a laser diode to a monomode elliptic-core fiber via a hyperbolic microlens on the fiber tip: efficiency computation with the ABCD matrix", Optical Engineering, 46, 025008, 2007.

- 115. P Patra, S Gangopadhyay, and K Goswami, "Mismatch considerations in laser diode to monomode fiber excitation via a hemispherical lens on the elliptical core fiber tip", Optik-International Journal for Light and Electron Optics, 119, 596-600, 2008.
- 116. Hongzhan Liu, "The approximate ABCD matrix for a parabolic lens of revolution and its application in calculating the coupling efficiency", Optik-International Journal for Light and Electron Optics, 119, 666-670, 2008.
- 117. Sumanta Mukhopadhyay, S Gangopadhyay, and S N Sarkar, "Coupling of a laser diode to monomode elliptic core fiber via upside down tapered microlens on the fiber tip: Estimation of coupling efficiency with consideration for possible misalignments by ABCD matrix formalism", Optik-International Journal for Light and Electron Optics, 121, 142-150, 2010.
- 118. Jin Huang and HuaJun Yang, "ABCD matrix model of quadric interface-lensed fiber and its application in coupling efficiency calculation", Optik-International Journal for Light and Electron Optics, 121, 531-534, 2010.
- 119. Sumanta Mukhopadhyay and Somenath N Sarkar, "Coupling of a laser diode to single mode circular core graded index fiber via hyperbolic microlens on the fiber tip and identification of the suitable refractive index profile with consideration for possible misalignments", Optical Engineering, 50, 045004, 2011.
- 120. Dipankar Kundu and Somenath N Sarkar, "Prediction of propagation characteristics of photonic crystal fibers by a simpler, more complete and versatile formulation of their effective cladding indices", Optical Engineering, 53, 056111, 2014.
- 121. Sumanta Mukhopadhyay, "Laser diode to circular core graded index single mode fiber excitation via upside down tapered microlens on the fiber tip and identification of the suitable refractive index profile", Journal of Physical Science, 20, 173-187, 2015.
- 122. Shubhendu Maiti, Anup Kumar Maiti, and Sankar Gangopadhyay, "Laser diode to single-mode triangular-index fiber excitation via upside down hemispherical microlens on the fiber tip: Prescription of ABCD matrix of transmission and estimation of coupling efficiency", Optik-International Journal for Light and Electron Optics, 144, 481-489, 2017.
- 123. Angshuman Majumdar, Chintan Kumar Mandal, and Sankar Gangopadhyay, "Laser Diode to Single-Mode Circular Core Parabolic Index Fiber Coupling via Upside-Down Tapered Hyperbolic Microlens on the Tip of the Fiber: Prediction of Coupling Optics by ABCD Matrix Formalism", Journal of Optical Communications, 1-10, 2017.
- 124. Sumanta Mukhopadhyay, "Efficient coupling of a laser diode to a parabolic microlens tipped circular core photonic crystal fiber using ABCD matrix formalism with consideration for possible misalignments", Journal of Optics, 47, 47-60, 2018.
- 125. L B Yuan and R L Shou, "Formation and power distribution properties of an upside down taper lens at the end of an optical fiber", Sensors and Actuators A: Physical, 23, 1158-1161, 1990.
- 126. Libo Yuan and Anping Qui, "Analysis of a single-mode fiber with taper lens end", Journal of the Optical Society of America A: Optics, Image Science, and Vision, 9, 950-952, 1992.
- 127. Samir Kumar Mondal, Sankar Gangopadhyay, and Somenath Sarkar, "Analysis of an upside-down taper lens end from a single-mode step-index fiber", Applied Optics, 37, 1006-1009, 1998.
- 128. HL An, "Theoretical investigation on the effective coupling from laser diode to tapered lensed single-mode optical fiber", Optics Communications, 181, 89-95, 2000.
- 129. Bishuddhananda Das, Anup Kumar Maiti, and Sankar Gangopadhyay, "Laser diode to single-mode circular core dispersion-shifted/dispersion-flattened fiber excitation via hyperbolic microlens on the fiber tip: Prediction of coupling efficiency by ABCD matrix formalism", Optik-International Journal for Light and Electron Optics, 125, 3277-3282, 2014.
- 130. Bishuddhananda Das, Tapas Ranjan Middya, and Sankar Gangopadhyay, "Mismatch considerations in excitation of single-mode circular core parabolic index fiber by laser diode via upside down tapered hemispherical microlens on the tip of the fiber", Journal of Optical Communications, 38, 375-382, 2017.
- 131. Hideo Kuwahara, M Sasaki, and N Tokoyo, "Efficient coupling from semiconductor lasers into single-mode fibers with tapered hemispherical ends", Applied Optics, 19, 2578-2583, 1980.
- 132. Hideo Kuwahara, Yoshihito Onoda, Masami Goto, and Takakiyo Nakagami, "Reflected light in the coupling of semiconductor lasers with tapered hemispherical end fibers", Applied Optics, 22, 2732-2738, 1983.
- 133. H Sakaguchi, N Seki, and S Yamamoto, "Power coupling from laser diodes into single-mode fibres with quadrangular pyramid-shaped hemiellipsoidal ends", Electronics Letters, 17, 425-426, 1981.
- 134. Masao Kawachi, Takao Edahiro, and H Toba, "Microlens formation on VAD single-mode fibre ends", Electronics Letters, 18, 71-72, 1982.
- 135. G Eisenstein and D Vitello, "Chemically etched conical microlenses for coupling single-mode lasers into single-mode fibers", Applied Optics, 21, 3470-3474, 1982.
- 136. H Ghafoori-Shiraz, "Experimental investigation on coupling efficiency between semiconductor laser diodes and single-mode fibres by an etching technique", Optical and quantum electronics, 20, 493-500, 1988.
- 137. A Kotsas, H Ghafouri-Shiraz, and TSM Maclean, "Microlens fabrication on single-mode fibres for efficient coupling from laser diodes", Optical and Quantum Electronics, 23, 367-378, 1991.
- 138. Herman M Presby, A F Benner, and C A Edwards, "Laser micromachining of efficient fiber microlenses", Applied Optics, 29, 2692-2695, 1990.
- 139. Christopher A Edwards and Herman M Presby, "Coupling of optical devices to optical fibers by means of microlenses", U.S. Patent No. 5,011,254, 1991.
- 140. Christopher A Edwards, Herman M Presby, and Corrado Dragone, "Ideal microlenses for laser to fiber coupling", Journal of Lightwave Technology, 11, 252-257, 1993.
- 141. Kazuo Shiraishi, Hidenori Ohnuki, Nobuaki Hiraguri, Kazuhito Matsumura, Isamu Ohishi, Hisashi Morichi, and Hajime Kazami, "A lensed-fiber coupling scheme utilizing a graded-index fiber and a hemispherically ended coreless fiber tip", Journal of Lightwave Technology, 15, 356-363, 1997.
- 142. Kenji Kawano, "Coupling characteristics of lens systems for laser diode modules using single-mode fiber", Applied Optics, 25, 2600-2605, 1986.
- 143. Yuichi Odagiri, Minoru Shikada, and Kohroh Kobayashi, "High-efficiency laser-to-fibre coupling circuit using a combination of a cylindrical lens and a selfoc lens", Electronics Letters, 13, 395-396, 1977.
- 144. Masatoshi Saruwatari and Kiyoshi Nawata, "Semiconductor laser to single-mode fiber coupler", Applied Optics, 18, 1847-1856, 1979.

- 145. J Minowa, M Saruwatari, and N Suzuki, "Optical componentry utilized in field trial of single-mode fiber long-haul transmission", IEEE Transactions on Microwave Theory and Techniques, 30, 551-563, 1982.
- 146. J. Yamada, Y. Murakami, J. Sakai and T. Kimura, "Characteristics of a hemispherical microlens for coupling between a semiconductor laser and single-mode fiber", IEEE Journal of Quantum Electronics, 16, 1067-1072, 1980.
- 147. Richard P Ratowsky, Long Yang, Robert J Deri, Kok Wai Chang, Jeffrey S Kallman, and Gary Trott, "Laser diode to single-mode fiber ball lens coupling efficiency: full-wave calculation and measurements", Applied Optics, 36, 3435-3438, 1997.
- 148. Robert Gale Wilson, "Ball-lens coupling efficiency for laser-diode to single-mode fiber: comparison of independent studies by distinct methods", Applied Optics, 37, 3201-3205, 1998.
- 149. Ashima Bose, Sankar Gangopadhyay, and Subhas Chandra Saha, "Laser diode to single mode circular core graded index fiber excitation via hemispherical microlens on the fiber tip: identification of suitable refractive index profile for maximum efficiency with consideration for allowable aperture", Journal of Optical Communications, 33, 15-19, 2012.
- 150. Lars Liebermeister, Fabian Petersen, Asmus v Münchow, Daniel Burchardt, Juliane Hermelbracht, Toshiyuki Tashima, Andreas W Schell, Oliver Benson, Thomas Meinhardt, and Anke Krueger, "Tapered fiber coupling of single photons emitted by a deterministically positioned single nitrogen vacancy center", Applied Physics Letters, 104, 031101, 2014.
- 151. Amnon Yariv, "Optical Electronics: Saunders College", Saunders College Publication, 1991. (ISBN: 9780030532399).
- 152. S N Sarkar, B P Pal, and K Thyagarajan, "Lens coupling of laser diodes to monomode elliptic core fibers", Journal of Optical Communications, 7, 92-96, 1986.
- 153. S Gangopadhyay and S N Sarkar, "Laser diode to single-mode fibre excitation via hyperbolic lens on the fibre tip: Formulation of ABCD matrix and efficiency computation", Optics Communications, 132, 55-60, 1996.
- 154. Sumanta Mukhopadhyay, "Coupling of a laser diode to single mode circular core graded index fiber via parabolic microlens on the fiber tip and identification of the suitable refractive index profile with consideration for possible misalignments", Journal of Optics, 45, 312-323, 2016.
- 155. Ashima Bose, S Gangopadhyay, and S C Saha, "Simple method for study of singlemode dispersion-shifted and dispersion-flattened fibers", Journal of Optical Communications, 33, 253-258, 2012.
- 156. Jaishri Guru and Hemant Damecha, "Digital watermarking classification: a survey", International Journal of Computer Science Trends and Technology, 5, 8-13, 2014.
- 157. Ingemar J Cox, Joe Kilian, F Thomson Leighton, and Talal Shamoon, "Secure spread spectrum watermarking for multimedia", IEEE Transactions on Image Processing, 6, 1673-1687, 1997.
- 158. Darko Kirovski and Borko Furht, "Multimedia Watermarking Techniques and Applications", 1, Auerbach Publications, 2006. (ISBN: 9781420013467).
- 159. Husrev T. Sencar, Mahalingam Ramkumar and Ali N. Akansu, "Data Hiding Fundamentals and Applications: Content Security in Digital Multimedia", Elsevier, 2014. (ISBN: 9780080488660).

- 160. Olanrewaju, Rashidah Funke, Othman Omran Khalifa, and Abdul Latif, "Computational intelligence: its application in digital watermarking", Middle East Journal of Scientific Research, 13, 25-30, 2013.
- 161. J Cox Ingemar, Matthew L Miller, Jeffrey A Bloom, Jessica Fridrich, and T Kalker, "Digital watermarking and steganography", Morgan Kaufmann Publisher, 2007. (ISBN: 978-0-12-372585-1).
- 162. Manpreet Kaur, Sonika Jindal, and Sunny Behal, "A study of digital image watermarking", Journal of Research in Engineering and Applied Sciences, 2, 126-136, 2012.
- 163. Roger Woods, John McAllister, Ying Yi, and Gaye Lightbody, "FPGA-based implementation of signal processing systems", John Wiley & Sons Ltd., 2008. (ISBN: 978-0-470-03009-7).
- 164. F. G. Coelho, R. J. Cintra and V. S. Dimitrov, "Efficient Computation of the 8-point DCT via Summation by Parts", Journal of Signal Processing Systems, 90, 505-514, 2018.
- 165. Paris Kitsos, Nikolaos S Voros, Tasos Dagiuklas, and Athanassios N Skodras, "A high speed FPGA implementation of the 2D DCT for Ultra High Definition video coding", 18th International Conference on Digital Signal Processing (DSP), 1-5, 2013. (DOI: 10.1109/ICDSP.2013.6622742).
- 166. David Taubman and Michael Marcellin, "JPEG2000 Image Compression Fundamentals, Standards and Practice: Image Compression Fundamentals, Standards and Practice", Springer Science & Business Media, 2012. (ISBN: 9781461507994).
- 167. Stephane G Mallat, "A theory for multire solution signal decomposition: the wavelet representation", IEEE Transactions on Pattern Analysis and Machine Intelligence, 11, 674-693, 1989.
- 168. Wim Sweldens, "The lifting scheme: A custom-design construction of biorthogonal wavelets", Applied and Computational Harmonic Analysis, 3, 186-200, 1996.
- 169. Kishore Andra, Chaitali Chakrabarti, and Tinku Acharya, "A VLSI architecture for lifting-based forward and inverse wavelet transform", IEEE Transactions on Signal Processing, 50, 966-977, 2002.
- 170. Amit Phadikar, Goutam Kumar Maity, Tien-Lung Chiu, and Himadri Mandal, "FPGA Implementation of Lifting-Based Data Hiding Scheme for Efficient Quality Access Control of Images", Circuits, Systems, and Signal Processing, 1-27, 2018. (DOI: 10.1007/s00034-018-0893-6).
- 171. Brian Chen and Gregory W Wornell, "Quantization index modulation: A class of provably good methods for digital watermarking and information embedding", IEEE Transactions on Information Theory, 47, 1423-1443, 2001.
- 172. Ahmet M Eskicioglu, "Application of multidimensional quality measures to reconstructed medical images", Optical Engineering, 35, 778-786, 1996.
- 173. Ahmet M Eskicioglu and Paul S Fisher, "Image quality measures and their performance", IEEE Transactions on Communications, 43, 2959-2965, 1995.
- 174. Christian J. van den Branden Lambrecht, "Vision models and applications to image and video processing ", Springer Science & Business Media, 2013. (ISBN: 978-1-4419-4905-9).
- 175. Ismail Avcibas, Bulent Sankur, and Khalid Sayood, "Statistical evaluation of image quality measures", Journal of Electronic imaging, 11, 206-224, 2002.

FPGA Based Reconfigurable Hardware Architecture for Quality Access Control ...

- 176. Carl E Halford, Keith A Krapels, Ronald G Driggers, and Eddie Burroughs, "Developing operational performance metrics using image comparison metrics and the concept of degradation space", Optical Engineering, 38, 836-845, 1999.
- 177. Rafael C Gonzalez and Richard E Woods, "Digital Image Processing", 3, Pearson, 2007. (ISBN: 978-81-317-2695-2).
- 178. Zhou Wang, Alan C Bovik, Hamid R Sheikh, and Eero P Simoncelli, "Image quality assessment: from error visibility to structural similarity", IEEE Transactions on Image Processing, 13, 600-612, 2004.
- 179. Chion-Ting Hsu and Ja-Ling Wu, "Multiresolution watermarking for digital images", IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing, 45, 1097-1101, 1998.
- Tryphon T Georgiou and Anders Lindquist, "Kullback-Leibler approximation of spectral density functions", IEEE Transactions on Information Theory, 49, 2910-2917, 2003.
- 181. Arden Wolfgang M, "The international technology roadmap for semiconductors perspectives and challenges for the next 15 years", Current Opinion in Solid State and Materials Science, 6, 371-377, 2013.
- 182. Naresh Grover and Soni M, "Reduction of power consumption in fpgas-an overview", International Journal of Information Engineering and Electronic Business, 4, 50-69, 2012.
- 183. Woods Roger, John McAllister, Ying Yi and Gaye Lightbody, "FPGA-based implementation of signal processing systems ", John Wiley & Sons, 2008. (ISBN: 978-0-470-03009-7).
- 184. C. Nagendra, M.J. Irwin and R.M. Owens, "Area-time-power tradeoffs in parallel adders", IEEE Circuits and Systems Society, 43, 689-702, 1996.
- 185. Jeffrey D Ullman, "Computational aspects of VLSI", Computer Science Press, 1984.
- 186. Kai Chen, Chenming Hu, Peng Fang, Min Ren Lin, and Donald L Wollesen, "Predicting CMOS speed with gate oxide and voltage scaling and interconnect loading effects", IEEE Transactions on Electron Devices, 44, 1951-1957, 1997.
- 187. Michael J Flynn, Patrick Hung, and Kevin W Rudd, "Deep submicron microprocessor design issues", IEEE Micro, 19, 11-22, 1999.
- 188. Abdellatif Bellaouarand Mohamed Elmasry, "Low-Power Digital VLSI Design: Circuits and Systems", Springer Science & Business Media, 2012. (ISBN: 9781461523550).
- 189. John L Hennessy, and David A Patterson, "Computer architecture: a quantitative approach", 5, Morgan Kaufmann Publication, 2011. (ISBN: 978-0-12-383872-8).
- 190. David E Culler, Jaswinder Pal Singh, and Anoop Gupta, "Parallel computer architecture: a hardware/software approach", Gulf Professional Publishing, 1999. (ISBN: 1558603433).
- 191. Donald G Bailey, "Design for embedded image processing on FPGAs", John Wiley & Sons, 2011. (ISBN: 0470828528).
- 192. Alan Moore, "Embedded Systems: High Performance Systems, Applied Principles and Practice", Clanrye International, 2015. (ISBN: 9781632401694).
- 193. Quartus II, " Handbook Version 11.1: Design and Synthesis ", Altera Corporation, 2011.
- 194. INC Xilinx, "Xilinx University Program Virtex-II Pro Development System Hardware Reference Manual", UG069 (v1. 0), March, 8, 2005.

- 195. Shih-Lun Lin, Chien-Feng Huang, Meng-Huan Liou, and Chien-Yuan Chen, "Improving histogram-based reversible information hiding by an optimal weight-based prediction scheme", Journal of Information Hiding and Multimedia Signal Processing, 4, 19-33, 2013.
- 196. Amit Joshi, Vivekanand Mishra, and Rajendra Patrikar, "Real Time Implementation of Integer DCT based Video Watermarking Architecture", International Arab Journal of Information Technology (IAJIT), 12, 741-747, 2015.
- 197. Amit Phadikar, Malay K Kundu, and Santi P Maity, "Quality access control of a compressed gray scale image", Proc. of Conf. On Computer Vision, Pattern Recognition, Image Processing and Graphics (NCVPRIPG 08), 13-19, 2008.
- 198. Himadri Mandal, Goutam Kr Maity, Amit Phadikar, and Tien-Lung Chiu, "FPGA based low power hardware implementation for quality access control of a compressed gray scale image", International Conference on Computational Intelligence, Communications, and Business Analytics, 416-430, 2017.
- 199. Himadri Mandal, Amit Phadikar, Goutam Kr Maity, and Tien-Lung Chiu, "FPGA based low power hardware for quality access control of compressed gray scale image", Microsystem Technologies, 1-14, 2018. (DOI: 10.1007/s00542-018-3817-2).
- 200. Wen-Hsiung Chen, CH Smith, and SC Fralick, "A fast computational algorithm for the discrete cosine transform", IEEE Transactions on Communications, 25, 1004-1009, 1977.
- 201. Amit Phadikar, Santi P Maity, and MK Mandal, "Quantization based data hiding scheme for quality access control of images", Proc. of the 12th IASTED International Conference on Internet and Multimedia Systems and Applications, 113-118, 2008. (DOI: 10.1109/ICVGIP.2008.23).
- 202. Himadri Mandal, Amit Phadikar, Goutam Kr Maity, and Tien-Lung Chiu, "FPGA based low power hardware implementation for quality access control of digital image using dither modulation", Devices for Integrated Circuit (DevIC), 642-646, 2017.
- 203. Guangyi Yang, Yue Liao, Qingyi Zhang, Deshi Li and Wen Yang, "No-Reference Quality Assessment of Noise-Distorted Images Based on Frequency Mapping", IEEE Access, 5, 23146-23156, 2017.
- 204. Hojatollah Yeganeh and Zhou Wang, "Objective Quality Assessment of Tone-Mapped Images", IEEE Transactions on Image Processing, 22, 657-667, 2013.
- 205. Hichem Belhadj, Vishal Aggrawal, Ajay Pradhan, and Amal Zerrouki, "Power-aware FPGA design", Actel Corporation White Paper, 75, 2009.
- 206. Philippe Garrault and Brian Philofsky, "HDL coding practices to accelerate design performance", Xilinx White Paper, 1-22, 2006.
- 207. Torresani Bruno, Wickerhauser Victor, "Wavelet Analysis and Active Media Technology", World Scientific, 2005. (ISBN: 9789814479738).
- 208. M Nagabushanam and S Ramachandran, "Fast implementation of lifting based 1D/2D/3D DWT-IDWT architecture for image compression", International Journal of Computer Applications, 51, 35-41, 2012.
- 209. Rafael C Gonzalez, Richard E Woods, and Steven L Eddins, "Digital image processing using MATLAB", Pearson-Prentice-Hall Upper Saddle River, New Jersey, 2004. (ISBN-10: 0982085400).
- 210. Thomas M. Cover and Joy A. Thomas, "Elements of Information Theory", 2, Wiley, 2006. (ISBN: 978-0-471-24195-9).

FPGA Based Reconfigurable Hardware Architecture for Quality Access Control...

- 211. Jeanne Chen, Wien Hong, Tung-Shou Chen, and Chih-Wei Shiu, "Steganography for BTC compressed images using no distortion technique", The Imaging Science Journal, 58, 177-185, 2010.
- 212. C. A. Edwards and Herman M Presby, "Coupling-sensitivity comparison of hemispheric and hyperbolic microlenses", Applied optics, 32, 1573-1577, 1993.
- 213. K Kurokawa and EE Becker, "Laser Fiber Coupling with a Hyperbolic Lens (Short Papers)", IEEE Transactions on Microwave Theory and Techniques, 23, 309-311, 1975.
- 214. Shouguo Zheng, Xinhua Zeng, Wei Luo, Safi Jradi, Jérôme Plain, Miao Li, Philippe Renaud-Goud, Régis Deturche, Zeng Fu Wang, Jieting Kou, Renaud Bachelor, and Pascal Royer, "Rapid fabrication of micro-nanometric tapered fiber lens and characterization by a novel scanning optical microscope with submicron resolution", Optics Express, 21, 30-38, 2013.
- 215. S Perumal Sankar, N Hariharan, and R Varatharajan, "A novel method to increase the coupling efficiency of laser to single mode fibre", Wireless Personal Communications, 87, 419-430, 2016.
- 216. Ajoy Ghatak and K Thyagarajan, "An introduction to fiber optics", Cambridge university press, 1998. (ISBN: 9780521577854).
- 217. John M Senior, and M Yousif Jamro, "Optical fiber communications: principles and practice", Pearson Education, 2009. (ISBN: 9780130326812).
- 218. D Marcuse, "Gaussian approximation of the fundamental modes of graded-index fibers", Journal of the Optical Society of America, 68, 103-109, 1978.
- 219. Somenath Sarkar, K Thyagarajan, and Arun Kumar, "Gaussian approximation of the fundamental mode in single mode elliptic core fibers", Optics Communications, 49, 178-183, 1984.
- 220. JP Meunier, J Pigeon, and J N Massot, "Perturbation theory for the evaluation of the normalized cutoff frequencies in radially inhomogeneous fibres", Electronics Letters, 16, 27-29, 1980.
- 221. K I White, "Design parameters for dispersion-shifted triangular-profile single-mode fibres", Electronics Letters, 18, 725-727, 1982.
- 222. Enakshi Sharma, I Goyal, and A Ghatak, "Calculation of cutoff frequencies in optical fibers for arbitrary profiles using the matrix method", IEEE Journal of Quantum Electronics, 17, 2317-2321, 1981.
- 223. Soumita Chakraborty, Debar up Roy, Sumanta Mukhopadhyay, and Somenath Sarkar, "An investigative study of efficient coupling mechanism of a hemispherical microlens tipped single mode photonic crystal fiber to a laser diode by ABCD matrix formulation and determination of the optimal separation distance", Optik-International Journal for Light and Electron Optics, 149, 81-89, 2017.
- 224. Sumanta Mukhopadhyay, "Misalignment studies in laser diode to hemispherical microlens tipped circular core photonic crystal fiber excitation", Journal for Foundations and Applications of Physics, 4, 74-100, 2017.
- 225. Binghua Su, Junwen Xue, Lu Sun, Huiyuan Zhao, and Xuedan Pei, "Generalized ABCD matrix treatment for laser resonators and beam propagation", Optics & Laser Technology, 43, 1318-1320, 2011.
- 226. Aswini Kumar Mallick, Sumanta Mukhopadhyay, and Somenath Sarkar, "Coupling of a laser diode to single mode circular core trapezoidal index fiber via hyperbolic microlens on the fiber tip and construction of empirical relations to determine the optimum back focal length", Optik-International Journal for Light and Electron Optics, 127, 11418-11426, 2016.

## About the book

In the era of advanced communication technology, distributing and storing digital multimedia content from one source for an endless variety of uses anywhere on the globe has become very easy. Over the last few decades, data hiding through watermarking has proven its potential in information security applications such as image quality access control, copyright protection, broadcast monitoring, and digital rights management. At the same time, optical fiber has emerged as a strong candidate for communication owing to its large bandwidth (~GHz) and low loss (~0.15 dB/km). The analysis of optical circuitry has therefore attracted global interest in recent years. This book focuses on FPGA-based image quality access control and on coupling optics in fiber communication. In the first part of the book, a field programmable gate array (FPGA) based very large scale integration (VLSI) architecture is developed for real-time quality access control of digital images. The primary objectives of the real-time reconfigurable hardware design are low power consumption, high throughput, reliability, low cost, and easy integration with existing consumer electronic devices. The second part presents a theoretical investigation of the coupling efficiency of a laser diode to a single-mode circular-core triangular-index fiber via an upside-down tapered hemispherical microlens on the fiber tip, in the presence of possible transverse and angular mismatches. Analytical formulations for the coupling optics are given using the ABCD matrix formalism.

This book will be useful for research scholars, engineers, computer scientists, forensic officers, technicians, manufacturers and vendors, and for mobile and video companies, banking services, and the medical industry.





**Dr. Himadri Mandal** received his M.Tech in VLSI design from Bengal Engineering and Science University, Shibpur, India, and his Ph.D. from the Department of Photonics Engineering, Yuan Ze University, Taiwan, R.O.C., in November 2018. He is presently working as Associate Professor in the Dept. of Electronics & Communication Engineering, Calcutta Institute of Technology, Uluberia, Howrah, W.B., India. He has authored about 18 research papers and one book.



**Dr. Amit Phadikar** received his B.E. (Hons) in Computer Science & Engineering from Vidyasagar University, W.B., his M.Tech in Information Technology (Artificial Intelligence) from Rajiv Gandhi Proudyogiki Vishwavidyalaya, Bhopal, and his Ph.D. in Engineering (Information Technology) from Bengal Engineering and Science University (now the Indian Institute of Engineering Science and Technology (IIEST)), Shibpur, India. Dr. Phadikar joined the Dept. of Information Technology, MCKV Institute of Engineering (NAAC Accredited "A" Grade Autonomous Institute), Liluah, Howrah, in 2005 as Lecturer. He is presently working as Professor in the same department and was its Head of Department during 2011-2015. From October 2006 to October 2007, he carried out a collaborative research project (Project Title: Video and Hard Copy Watermarking using Soft-Computing Methodology) with the "Centre for Soft Computing Research-A National Facility", Indian Statistical Institute, Kolkata. He received a National Scholarship, through the Graduate Aptitude Test in Engineering (GATE), for his M.Tech study. He has authored about 70 research papers, one book chapter, and two books, and has guided one Ph.D. student.



Kripa-Drishti Publications A-503 Poorva Heights, Pashan-Sus Road, Near Sai Chowk, Pune – 411021, Maharashtra, India. Mob: +91 8007068686 Email: editor@kdpublications.in Web: https://www.kdpublications.in

