# On the co-design of electronics and photonics for optical communication

### Bahaa Radi



# Department of Electrical & Computer Engineering McGill University Montréal, Québec, Canada

June 23, 2020

A thesis submitted to McGill University in partial fulfillment of the requirements of the degree of

Doctor of Philosophy

©2020 Bahaa Radi

## Abstract

The explosive growth of internet traffic has led to an increase in data communication within data centers that use optical interconnects for communication. This calls for the development of low-power, high-speed, and high-sensitivity optical receivers to support this increased communication. Additionally, with silicon photonic technologies offering new design opportunities, the co-design of electronics and photonics could lead to more efficient, improved receivers. Under the umbrella of electronics/photonics co-design, this thesis explores two themes: 1) the design of energy-efficient optical receivers that leverage silicon photonics to replace clock phase generation circuits, and 2) the implementation of passives that are conventionally found on the electronic side in the silicon photonic technology stack.

Under the first theme, three optical receivers are designed. The first receiver is a 12.5 Gb/s optical receiver with a conventional front-end that uses a silicon photonic structure to split the input signal into four data streams and delay each stream by one bit relative to the next. In this fashion, demultiplex-by-four outputs, which are conventionally obtained by using four quadrature phases, can be obtained with only a single clock phase. This technique reduces the share of the receiver's total power consumed by clock phase generation from the 30-45 % reported in the literature to around 10 %. This receiver has an energy efficiency of 1.93 pJ/bit and a sensitivity of -4 dBm excluding optical splitting losses. The second receiver improves upon the first by using a high-bandwidth, gain-improved transimpedance amplifier with a pseudo-differential output followed by a comparator. This receiver also reduces the number of data streams from four to two, decreasing the optical losses at the input and simplifying the receiver. This receiver achieves a speed of 17 Gb/s, a sensitivity of -7 dBm including 3 dB optical losses, and an energy efficiency of 156 fJ/bit, the best reported energy efficiency at the time of writing this thesis. Finally, to improve the speed, a third receiver with a novel two-bit integrating front-end was developed. The low-bandwidth front-end of this receiver allows it to operate at higher data rates for a given bandwidth and avoids the conventional power-hungry transimpedance amplifiers at the input, which do not scale well with the technology node. This receiver achieves a speed of 22 Gb/s. All three receivers are fabricated and validated experimentally.

The second theme explores three passive structures implemented in the silicon photonics stack. The first is a 15 GHz on-chip monopole antenna. The antenna is designed and fabricated, and RF inter-chip data transmission in silicon photonics is demonstrated for the first time. The second structure explores the design of a low-pass filter driven by a photodetector. Design methodology, fabrication, and measurements are presented and compared to simulation values. Finally, a moving average filter is developed using optical delay lines, photodetectors, and a capacitor. Experimental validation confirms proper 1-bit moving average operation at 2.5 Gb/s and 2-bit moving average operation at 5 Gb/s.

## Abrégé

The explosive growth of Internet traffic has led to an increase in data communication within data centers that use optical interconnects for communication. This requires the development of low-power, high-speed, and high-sensitivity optical receivers to support this increased communication. Moreover, with the new opportunities for innovation offered by silicon photonic technologies, the co-design of electronics and photonics could lead to more efficient and improved receivers. Under the umbrella of electronics/photonics co-design, this thesis explores two themes: 1) the design of energy-efficient optical receivers that exploit silicon photonics to replace clock phase generation circuits, and 2) the implementation of passive elements, which are conventionally found on the electronic side, in silicon photonic technology.

Regarding the first theme, three optical receivers are designed. The first receiver is a 12.5 Gb/s optical receiver with a conventional front-end that uses a silicon photonic structure to split the input signal into four data streams and delay each stream by one bit relative to the next. In this way, the demultiplex-by-four outputs, which are generally obtained using four quadrature phases, can be obtained with a single clock phase. This technique reduces the share of the receiver's total power consumed by clock phase generation from the 30 to 45 % reported in the literature to around 10 %. This receiver has an energy efficiency of 1.93 pJ/bit and a sensitivity of -4 dBm excluding optical splitting losses. The second receiver improves upon the first by using a high-bandwidth, gain-improved transimpedance amplifier with a pseudo-differential output followed by a comparator. This receiver also reduces the number of data streams from four to two, which decreases the optical losses at the input and simplifies the receiver. This receiver is able to reach a speed of 17 Gb/s, a sensitivity of -7 dBm including 3 dB of optical losses, and an energy efficiency of 156 fJ/bit, which is the best energy efficiency reported at the time of writing this thesis. Finally, to improve the speed, a third receiver with a novel two-bit integrating front-end was developed. This low-bandwidth front-end allows the receiver to operate at higher data rates for a given bandwidth and avoids power-hungry transimpedance amplifiers at the input, which do not scale well with the technology node. This receiver reaches a speed of 22 Gb/s. All three receivers are fabricated and validated experimentally.


The second theme examines three passive structures implemented in silicon photonic technology. The first is a 15 GHz on-chip monopole antenna. The antenna is designed and fabricated, and RF inter-chip data transmission in silicon photonics is demonstrated for the first time. The second structure explores the design of a low-pass filter connected to a photodetector. The design methodology, fabrication, and measurements are presented and compared to simulation values. Finally, a moving average filter is developed using optical delay lines, photodetectors, and a capacitor. Experimental validation confirms 1-bit moving average operation at 2.5 Gb/s and 2-bit moving average operation at 5 Gb/s.

## Acknowledgements

I would like to thank my advisor, Prof. Odile Liboiron-Ladouceur, for this fascinating journey. This research would not have been possible without her guidance, discussions, and encouragement. She also helped me to improve my presentation and writing skills. I am truly grateful for her advice. She also taught me to take some opportunities that may not look attractive at first, but then proved to be very valuable. I learned more than just research skills from her.

I would also like to acknowledge Prof. Frédéric Nabki and Prof. Michaël Ménard for their close involvement with the majority of my research. They provided constant feedback and took the time to meet very frequently with me to support this research and thesis. I thank them very much for their help and support. This research would not be the same without their help.

I would like to thank Prof. Gordon Roberts for being on my PhD committee and providing some interesting general feedback on the work I was doing. I learned a lot from his insight.

I would like to thank Prof. Tony Chan Carusone for agreeing to take the time to read my thesis and to be the external examiner. I would also like to thank Prof. Ke Wu and Prof. Boris Vaisband for taking the time to be on the defense committee.

I would like to specifically thank Mohammadreza Sanadgol Nezami, Mohammad Taherzadeh-Sani, Ajaypal Singh Dhillon, Vernon Elmo-Paul, and Yule Xiong for their help with my research and for being excellent collaborators.

I acknowledge the financial support I received from McGill University through the McGill Engineering Doctoral Awards (MEDA). I would not have been able to pursue a PhD without this. I also acknowledge the contribution of the Canadian Microelectronics Corporation (CMC) for giving me access to their design tools and for subsidizing the fabrication of my chips. I also acknowledge the professors of The Photonic Systems Group who allowed me to use the equipment needed to measure my chips.

I would like to thank some of my close friends, Ahmad Saleh, Kahlil Hindawi, Feras Elsaid, and Yazan Alem who shared my struggles of doing a PhD and certainly made it more bearable.

I would like to thank my fiancée Rwan Salah for being very supportive. I am very lucky to have met you in my final year; you made my life much more exciting. You are the flower of my life.

Finally, my deepest appreciation goes to my parents, Sanaa Ghneam and Ishaq Radi, who offered the utmost genuine and honest unconditional support that a human being could ever receive. I would not be where I am today without them. I dedicate this work to them.

# Contents

| Section | Title |
|---------|-------|
| 1 | Introduction |
| 1.1 | Motivation |
| 1.2 | Thesis objectives |
| 1.3 | Claim of Originality |
| 1.3.1 | Publications and contributions of the author |
| 1.4 | Thesis Organization |
| 2 | Background |
| 2.1 | An overview of silicon photonics |
| 2.1.1 | Motivation for passives electronics in Silicon Photonics |
| 2.1.2 | A note on process variations in Silicon Photonics |
| 2.2 | Motivation for optical versus electronic delay lines |
| 2.2.1 | CMOS Delay Lines |
| 2.2.2 | Optical delay lines |
| 2.2.3 | Trade-off considerations |
| 2.3 | Optical receivers with demultiplexed output |
| 2.3.1 | Brief overview of conventional source-synchronous optical receivers |
| 2.3.2 | Clock phases in optical receivers |
| 2.3.3 | Front-end bandwidth |
| 2.4 | Optical split and delay structure |
| 2.5 | Summary |
| 3 | An Optical Receiver Exploiting SiP Delays for Clock Phases Replacement |
| 3.1 | Introduction |
| 3.2 | System Architecture |
| 3.3 | PCB parasitics and AC coupling noise analysis |
| 3.4 | Experimental validation |
| 3.5 | Discussion |
| 3.6 | Conclusion |
| 4 | A 17 Gbps 156 fJ/bit Two-Channel Optical Receiver in 65 nm CMOS |
| 4.1 | Two-channel optical receivers overview |
| 4.1.1 | Two-phase clocked receiver |
| 4.1.2 | Two channel optical split and delay receiver |
| 4.2 | Design of the two-channel electronic receiver with optical-input split and delay |
| 4.2.1 | Electronic receiver architecture |
| 4.2.2 | Transimpedance amplifier with single-ended-input and differential output |
| 4.2.3 | High-speed comparator with offset nulling and latch |
| 4.3 | Measurement results |
| 4.4 | Conclusion |
| 5 | A 22 Gb/s Time-Interleaved Optical Receiver with Integrating Front-end |
| 5.1 | Introduction |
| 5.2 | Low-bandwidth receiver architecture |
| 5.2.1 | Integrating receiver front-end |
| 5.2.2 | Resettable receiver, current-amplifier-based receivers, and integrate-and-dump receiver |
| 5.3 | Proposed time-interleaved receiver |
| 5.3.1 | Architecture and operation |
| 5.3.2 | Analysis of the integration |
| 5.3.3 | Noise in the two-bit integrating front-end receiver and input capacitance impact on SNR |
| 5.3.4 | Detailed circuit implementation |
| 5.4 | Experimental results |
| 5.5 | Discussion |
| 5.6 | A note on the silicon-photonic structures compatible with the two-bit integrating front-end receiver |
| 5.7 | Conclusion |
| 6 | Demonstration of Inter-chip Transmission with On-Chip Antennas in SiP |
| 6.1 | Introduction |
| 6.2 | Antenna design |
| 6.3 | Experimental validation |
| 6.4 | Discussion |
| 6.5 | Conclusion |
| 7 | Integrated RF Passive Low-Pass Filters in Silicon Photonics |
| 7.1 | Introduction |
| 7.2 | Circuit model and design parameters |
| 7.3 | Experimental results |
| 7.4 | Conclusion |
| 8 | A High-speed Moving Average Integrator in SiP for TIA-less Receivers |
| 8.1 | Introduction |
| 8.2 | Design methodology |
| 8.3 | Data recovery principle |
| 8.4 | Experimental procedure and results |
| 8.5 | Conclusion |
| 9 | Proposed Future Work and Conclusions |
| 9.1 | Proposed future work |
| 9.1.1 | Final integration of photonic and electronic chips |
| 9.1.2 | Monolithic design |
| 9.1.3 | Clock recovery for the optical receivers |
| 9.1.4 | A silicon photonics based delay locked loop |
| 9.2 | Conclusions |

# List of Figures

| 1.1 | Data traffic per year in data centers [1]                                  | 2  |
|-----|----------------------------------------------------------------------------|----|
| 1.2 | Data traffic breakdown by destination [1]                                  | 2  |
| 2.1 | Silicon photonics process cross-section [17]                               | 17 |
| 2.2 | Silicon photonics process cross-section [17]                               | 18 |
| 2.3 | Photodetector model in [19]                                                | 19 |
| 2.4 | Layout of an N-channel MOSFET next to two MIM capacitors illustrating      |    |
|     | that passives can much larger area compared to transistors                 | 22 |
| 2.5 | Summary of benefits of relocating passives from the IC side to the silicon |    |
|     | photonics side.                                                            | 24 |
| 2.6 | Inverter-based tapped delay line                                           | 26 |
| 2.7 | Single output delay line                                                   | 27 |
| 2.8 | Two path delay line capable of sub-gate delays                             | 27 |

List of Figures xv

| 2.9  | Analog buffer that can be used to delay analog signals and controlled through |    |
|------|-------------------------------------------------------------------------------|----|
|      | adjusting the load                                                            | 28 |
| 2.10 | Supply controlled delay line                                                  | 29 |
| 2.11 | Current starved delay line                                                    | 30 |
| 2.12 | Optical delay line model                                                      | 30 |
| 2.13 | Electronically tunable optical delay lines                                    | 32 |
| 2.14 | Conventional optical receiver with demultiplexed outputs                      | 36 |
| 2.15 | Sampling in conventional four clock phase receiver (left) and two clock phase |    |
|      | receiver (right)                                                              | 36 |
| 2.16 | An optical receiver architecture that uses split and delay functionality to   |    |
|      | produce demultiplexed outputs                                                 | 37 |
| 2.17 | Sampling in using silicon photonics structure with four quarter-rate outputs  |    |
|      | (left) and two half-rate outputs (right)                                      | 38 |
| 2.18 | Optical split-delay structure                                                 | 36 |
| 3.1  | (a) Sampling in conventional optoelectronic receiver with four clock phases.  |    |
|      | (b) Sampling in the proposed system using four different delays               | 43 |
| 3.2  | The SiP chip is outlined in red while the IC is outlined in blue              | 44 |
| 3.3  | Detailed circuit implementation of one of the four sub-receivers              | 46 |
| 3.4  | Sampling of amplified bits to generate quarter rate output                    | 47 |
| 3.5  | Ac simulations with idea photodetector                                        | 48 |

List of Figures xvi

| 3.6  | A micrograph of the IC chip. The IC is wire bonded to a QFN-80 package                 |    |
|------|----------------------------------------------------------------------------------------|----|
|      | (not shown) mounted on a high-speed low-loss RO4350B PCB                               | 49 |
| 3.7  | Model of the PCB interconnect and package between the PD and the input                 |    |
|      | of each sub receiver                                                                   | 50 |
| 3.8  | AC simulations comparing the performance of the IC chip when wire bonded               |    |
|      | to a PD compared to the IC mounted on a PCB                                            | 51 |
| 3.9  | The fraction of accepted power versus frequency due to parasitics at the input.        | 52 |
| 3.10 | Test setup for the IC in the receiver using an external photodetector module.          |    |
|      | The dashed outline represents the components used to emulate the SiP chip.             |    |
|      | Dotted lines represent the change of the connection between the PD and the             |    |
|      | input of the IC for testing of each of the four channels                               | 54 |
| 3.11 | The measured BER curve for each of the sub receivers with the four output              |    |
|      | eye diagrams with a data rate of $3.125~\mathrm{Gb/s}$ . Output eye diagrams are shown |    |
|      | for $10^{-12}$ for Rx1-Rx3 and approximately $10^{-6}$ for Rx4                         | 55 |
| 3.12 | Bathtub curve measurements of the receiver for a 12.5 Gb/s input                       | 56 |
| 3.13 | Power consumption breakdown                                                            | 57 |

List of Figures xvii

| 4.1  | (a) A conventional two-channel receiver architecture that splits the paths after          |    |
|------|-------------------------------------------------------------------------------------------|----|
|      | the TIA, and requires the Clock and $Clock\_180^o$ (i.e., the clock signal that           |    |
|      | is shifted by $180^{\circ}$ ) phases; (b) a two-channel receiver architecture that splits |    |
|      | the paths before the PD, and requires the Clock and $Clock_{-}180^{o}$ phases; (c)        |    |
|      | the proposed two-channel receiver architecture that splits the paths before the           |    |
|      | PD and only requires one clock phase for its comparators                                  | 64 |
| 4.2  | Silicon photonics (SiP) split-delay structure schematic with envisioned                   |    |
|      | integration with the electronic IC chip                                                   | 67 |
| 4.3  | The system-level details of the implemented two-channel receiver                          | 68 |
| 4.4  | The TIA circuit and its connections to the PD and comparator                              | 70 |
| 4.5  | The bode diagram for (a) $(V_{OUT}-V_{IN})/I_{IN}$ and $V_{OUT}$ / $I_{IN}$ ; (b)         |    |
|      | $(V_{OUT} - V_{IN}) / I_{IN}$ with and without the damping resistor $R_{LF}$              | 73 |
| 4.6  | The dynamic comparator and latch with offset-nulling signals $V_{BP}$ and $V_{BN}$ .      | 74 |
| 4.7  | Packaged chip micrograph of the receiver and its connections to                           |    |
|      | photodetectors with 1 mm bondwires                                                        | 78 |
| 4.8  | Experimental test setup used to validate the optical receiver                             | 80 |
| 4.9  | Bit error rate (BER) for a 17 Gbps PRBS 10 optical input signal of the full               |    |
|      | receiver versus the input OMA at the input of the splitter and considering its            |    |
|      | 3.3 dB loss for the splitter                                                              | 81 |
| 4.10 | Full receiver bathtub curve at a 17 Gbps input                                            | 81 |

List of Figures xviii

| 4.11 | The eye diagram of the input and output signals. Since the output signal                        |    |
|------|-------------------------------------------------------------------------------------------------|----|
|      | is single-ended and has a small amplitude, it is slightly distorted by some                     |    |
|      | common-mode noise                                                                               | 82 |
| 5.1  | Simplified integrating front-end receiver architecture with the four clock                      |    |
|      | phases $\Phi_1$ to $\Phi_4$                                                                     | 90 |
| 5.2  | Voltage at the input of the sampling circuit when the sequence 1110 is received.                |    |
|      | $\Delta v_x$ (x = 1,2,3,4) is the voltage difference between two consecutive samples.           | 91 |
| 5.3  | Basic operation of the dynamic offset modulation (DOM) in the receiver to                       |    |
|      | compensate for CID. The red arrows indicate the offset generated by the                         |    |
|      | DOM circuit to compensate the $\Delta v$ shown in Fig. 5.2 and clamps the voltage               |    |
|      | difference to $\pm (\Delta v_{max})/2$                                                          | 91 |
| 5.4  | Resettable receiver architecture operation and timing diagram showing the                       |    |
|      | integration for 0.5 UI and reset for 0.5 UI                                                     | 93 |
| 5.5  | Current-amplifier-based receiver architecture showing two interleaved paths                     |    |
|      | and sampling using two phases $(\Phi, \overline{\Phi})$ and a delayed version of the two phases |    |
|      | $(\Phi_d, \overline{\Phi_d}))$                                                                  | 94 |
| 5.6  | Timing and operation of the current-amplifier-based receiver showing the reset                  |    |
|      | (0.25 UI), sample (0.75 UI), and hold phases (1 UI)                                             | 95 |
| 5.7  | Integrate-and-dump receiver showing four interleaved paths. It utilizes four                    |    |
|      | clock phases $(\Phi_1, \Phi_2, \overline{\Phi_1}, \overline{\Phi_2})$                           | 96 |

List of Figures xix

| 5.8  | Timing and operation of the of the integrate-and-dump receiver showing the                  |      |
|------|---------------------------------------------------------------------------------------------|------|
|      | four phases: internal reset, external reset, integrate, and hold                            | 96   |
| 5.9  | Block diagram of the four proposed sub receivers and connection to the optical              |      |
|      | blocks with the delay scheme used. A single pole PD model is shown in the                   |      |
|      | inset                                                                                       | 98   |
| 5.10 | Timing diagram showing the operation of the receiver and the two phases of                  |      |
|      | operation                                                                                   | 99   |
| 5.11 | Voltage integration $(\Delta v)$ at the front-end for all possible input values when        |      |
|      | the bandwidth of the photodetector is higher than 0.7 of the data-rate. The                 |      |
|      | bottom part shows an overlay of all $\Delta v$ possibilities                                | 101  |
| 5.12 | $\Delta v$ when the bandwidth of the photodetector is lower than 0.7 of the data-rate       | .102 |
| 5.13 | The ratio of $\Delta v_{01}/\Delta v_{0.75T_b}$ vs photodetector bandwidth (in terms of bit |      |
|      | duration) in the case of the proposed receiver over that of the                             |      |
|      | current-amplifier-based receiver                                                            | 106  |
| 5.14 | Simulated SNR versus $\mathcal{C}_{IN}$ showing improvement with a smaller capacitance.     | 109  |
| 5.15 | Detailed circuit implementation of one of the four sub receivers in the proposed            |      |
|      | receiver. The input is wire bonded to a photodetector                                       | 111  |
| 5.16 | AC simulation gain of the amplifier stages                                                  | 113  |
| 5.17 | The output power of the two amplifier stages versus the input power. The                    |      |
|      | input-referred 1-dB compression point is -13.4 dBm                                          | 113  |

List of Figures xx

| 5.18 | Two-tone test showing the fundamental and the third-order harmonic powers.    |     |
|------|-------------------------------------------------------------------------------|-----|
|      | The IIP3 is at -4.15 dBm                                                      | 114 |
| 5.19 | Micrograph of the fabricated chip occupying 1.5 mm $\times$ 1.5 mm and wire-  |     |
|      | bonded to a 1 $\times$ 4 PD array with a 250 $\mu m$ pitch                    | 115 |
| 5.20 | Single sub receiver measurements setup                                        | 117 |
| 5.21 | Full system measurement setup                                                 | 118 |
| 5.22 | BER measurements for PRBS 7 input                                             | 119 |
| 5.23 | BER measurements for PRBS 15 input                                            | 119 |
| 5.24 | 5.5 Gb/s output quarter-rate eye diagram                                      | 120 |
| 5.25 | Bathtub measurements at 22 Gb/s                                               | 120 |
| 5.26 | BER curve comparing single channel operation with full system operation and   |     |
|      | crosstalk penalty at 22 Gb/s and with a PRBS 7 sequence                       | 121 |
| 5.27 | Layout of a proposed split-delay SiP structure including a grating coupler,   |     |
|      | three directional couplers acting as power splitters, two one-bit delay lines |     |
|      | and four photodetectors                                                       | 130 |
| 5.28 | SiPh chip with Under Bump Metallization (UBMs) used to connect to the IC      |     |
|      | chip in [48]                                                                  | 131 |
| 5.29 | Electronically tunable delay lines consisting of a ring resonator and an MZI  |     |
|      | delay elements [29]                                                           | 132 |


| 5.30 | Layout of a proposed split-delay SiP with directional couplers to ensure equal          |         |
|------|-----------------------------------------------------------------------------------------|---------|
|      | power at the output                                                                     | 133     |
| 6.1  | (a) Layout of the fabricated antenna. (b) Stack layers of the AMF SiP                   |         |
|      | fabrication process used to fabricate the antenna                                       | 138     |
| 6.2  | The simulated donut-shaped gain of the antenna perpendicular to the plane               |         |
|      | of the antenna. This gain is valid in the far region of operation                       | 141     |
| 6.3  | (a) Measured and simulated (Sim) S-parameters with the antennas placed                  |         |
|      | $0.3175~\mathrm{cm}$ (0.125") apart. (b) Measured peak S21 at 15 GHz at three different |         |
|      | distances                                                                               | 143     |
| 6.4  | Measurements setup for inter-chip communication with an external                        |         |
|      | photodetector as a transmitter directly driving the antenna. GSG probes                 |         |
|      | were used to drive the antennas as shown above the antenna pair symbol                  | 145     |
| 6.5  | Inter-chip antenna measured BER and eye diagrams curve for two different                |         |
|      | distances of antenna                                                                    | 146     |
| 7.1  | Schematic of the circuit model for the designed RC low pass filters. Dashed             |         |
|      | boxes outline the PD and LPF configurations                                             | 152     |
| 7.2  | (a) Cross-section view of the MIM capacitor and layout view of the LPF                  |         |
|      | structure (GC: grating coupler, PD: photodiode, M1: Metal 1, M2: Metal 2).              |         |
|      | (b) Micrograph of the RC filter structure in an active silicon photonics process        | 154     |


| 7.3 | Measured and fitted $S_{21}$ parameter using the lumped model shown in Fig. 7.1          |     |
|-----|------------------------------------------------------------------------------------------|-----|
|     | and the parameters values in Table 7.1 for chip no. 3 at 2 V reverse bias                | 158 |
| 7.4 | Measured and fitted $S_{22}$ parameter using the lumped model shown in Fig. 7.1          |     |
|     | and the parameters values in Table 7.1 for chip no. 3 at 2 V reverse bias                | 159 |
| 8.1 | Schematic diagram of the moving average structure and the cross-section of               |     |
|     | the MIM capacitors in the silicon photonics process                                      | 165 |
| 8.2 | Calculated group delay per mm for 3 $\mu\mathrm{m}\times$ 220 nm low-loss rib waveguides |     |
|     | used for the delay (DL). The inset shows a simulation of the cross-section of            |     |
|     | the fundamental mode                                                                     | 167 |
| 8.3 | Micrograph of the wire-bonded silicon photonic die on the PCB with an                    |     |
|     | enlarged view of the fabricated moving average structure. GC: grating                    |     |
|     | coupler; DC: directional coupler; DL: delay line; PD1, PD2: photodiodes; G:              |     |
|     | ground pads; S: signal pads; V: DC voltage pads; $C_f$ : integrating capacitor           | 168 |
| 8.4 | Schematic diagram of the moving average structure including the parasitics               |     |
|     | and the high-impedance probe input resistance where $R_L=100~k\Omega$ is the             |     |
|     | high-impedance probe input impedance, $C_f = 200 \ fF$ is the moving average             |     |
|     | integrator, $C_{j1} = C_{j2} = 15 \ fF$ are the photodiode junction capacitances,        |     |
|     | $R_{s1}=R_{s2}=125~\Omega$ are the photodiode series resistances, and $C_{pad1}=$        |     |
|     | $C_{pad2} = 15 \ fF$ are the pad capacitances                                            | 170 |

| 8.5 | Simulation results for random sequences (a) at 2.5 Gbps with two decision-                  |     |
|-----|---------------------------------------------------------------------------------------------|-----|
|     | making levels of $j_1$ and $j_2$, (b) at 5 Gbps with three decision-making levels of        |     |
|     | $m_1$, $m_2$, and $m_3$, and (c) at 10 Gbps with five decision-making levels of $n_1$,      |     |
|     | $n_2$, $n_3$, $n_4$, and $n_5$. The insets show the corresponding simulated eye diagrams    | 170 |
| 8.6 | Experimental setup for the characterization of the moving average integrator                | 172 |
| 8.7 | (a) Experimental validation of the 400 ps optical delay line (length of 32.3                |     |
|     | mm at 1550 nm) using 50 $\Omega$ impedance GSG probes. Experimental versus                  |     |
|     | simulation results: the purple graphs are the measured output and the                       |     |
|     | overlapped red graph is the simulated output at (b) 2.5 Gbps and (c) 5 Gbps                 | 173 |
| 9.1 | A CDR that could be used with the proposed receiver in [7] based on [34]                    | 178 |
| 9.2 | A DLL that could be used with the proposed receiver based on [34]                           | 179 |

# List of Tables

| 3.1 | Performance summary and comparison                                    | 60  |
|-----|-----------------------------------------------------------------------|-----|
| 4.1 | Component values used in Fig. 4.4 and to plot Fig. 4.5                | 72  |
| 4.2 | Performance summary and comparison                                    | 83  |
| 5.1 | Performance summary and comparison (part 1)                           | 125 |
| 5.2 | Performance summary and comparison (part 2)                           | 126 |
| 7.1 | Equivalent circuit parameters                                         | 156 |
| 7.2 | Measured versus designed capacitance (pF) and resistance ( $\Omega$ ) | 158 |
| 7.3 | Filter cut-off frequencies (GHz)                                      | 160 |

# List of Acronyms

AC Alternating current

ADC Analogue to digital converter

**BER** Bit error rate

**BOX** Buried oxide

**BPG** Bit pattern generator

**CID** Consecutive identical digits

**CMOS** Complementary metal oxide semiconductor

CML Current mode logic

**CW** Continuous wave

**ED** Error detector

**EDFA** Erbium-doped fiber amplifier

**DCA** Digital communication analyzer

**Demux** Demultiplexing

**DFE** Decision feedback equalizer

**DOM** Dynamic offset modulation

**DRC** Design rule check

**FEC** Forward error correction

**ISI** Intersymbol interference

IC Integrated circuit

I/O Input-output

LA Limiting amplifier

MIM Metal insulator metal

MOSFET Metal oxide semiconductor field effect transistor

MUX Multiplexer

MZI Mach-Zehnder interferometer

MZM Mach-Zehnder modulator

**NF** Noise figure

NRZ Non-return-to-zero

**ODL** Optical delay line

OOK On-off keying

OMA Optical modulation amplitude

PAM Pulse amplitude modulation

PC Polarization controller

PCB Printed circuit board

**PD** Photodetector

PLL Phase-locked loop

**PPG** Pulse pattern generator

PRBS Pseudorandom binary sequence

**RF** Radio frequency

RX Receiver

**RZ** Return-to-zero

SiGe Germanium on Silicon

SiP Silicon photonics

**SNR** Signal-to-noise ratio

**SOI** Silicon on insulator

**TIA** Transimpedance amplifier

TX Transmitter

VOA Variable optical attenuator

**XOR** Exclusive OR boolean logic

## Chapter 1

## Introduction

Over the past few years, internet traffic has increased significantly; it was projected to grow from 4.7 zettabytes in 2015 to 15.3 zettabytes in 2020, as shown in Fig. 1.1 [1]. This increase is driven by the expanding use of high-definition online streaming, cloud storage, cloud computing, social media, and game streaming services, among other data-hungry services. The projection does not account for the ongoing pandemic, which is likely to increase data usage further as more people work from home and have more time to use online services.

Interestingly, most data traffic, 77 %, takes place within the data center, as shown in Fig. 1.2 [1]. To support traffic of this magnitude within data centers, optical interconnects offer an attractive alternative to copper interconnects, which suffer from many limitations at high speed. To accommodate these optical interconnects within data centers, high-speed and energy-efficient optical receivers are required.

Fig. 1.1: Data traffic per year in data centers [1].

Fig. 1.2: Data traffic breakdown by destination [1].

### 1.1 Motivation

The advent of optical interconnects in data centers requires the design of energy-efficient optoelectronic transceivers. In the near term, to improve the bandwidth and reduce the power consumption of optoelectronic transceivers, optical I/O solutions must integrate electronic and photonic elements in the package [2]. This integration provides an opportunity where photonic elements can be used to perform optical data processing, potentially eliminating or replacing certain electronic circuit blocks in optoelectronic transceivers and improving their overall energy efficiency. For example, in [2], modulators, waveguides, and photodetectors are integrated with a CMOS transceiver. In [3], an add/drop wavelength filter is integrated with a CMOS receiver. In [4], an optical wavelength interleaver is integrated with a CMOS transceiver.

Silicon photonics is a logical candidate to realize this approach as its manufacturing leverages the existing CMOS technology infrastructure.

This motivates the co-design of electronics and photonics to meet the requirements of optical interconnects in data centers. It also motivates exploring the feasibility of developing passives in silicon photonics that are usually found on the electronic integrated circuit side, as this may lead to the development of better integrated co-designed electronic/optical systems.

### 1.2 Thesis objectives

The primary objective of this thesis is to develop an energy-efficient, high-speed, high-sensitivity optical receiver that leverages silicon photonics to improve the performance of the system. More specifically, the thesis will attempt to validate the following two hypotheses:

- Clock generation for optical receivers can be constructed from optical delay lines instead of electronic delay lines to achieve lower power consumption and higher energy efficiency.
- An optical receiver can be constructed in silicon that uses a low-bandwidth front-end electronic circuit of less than 0.7×data-rate to achieve the given data-rate throughput.

To this end, three integrated circuits were developed and measured. The first receiver in [5] explores the potential of using silicon photonic delay lines to replace quadrature clock phase generation circuits in optical receivers. The second receiver in [6] applies the same concept to a two-clock-phase optical receiver and simplifies the first design in an attempt to achieve better power consumption and sensitivity. The third receiver in [7] employs the same concept but also presents a novel low-bandwidth front-end to boost the speed of the receiver.

A secondary objective is to explore the potential of implementing passive components in the silicon photonic platforms. This exploration may lead to better co-designed optical/electronic systems and reduced cost. More specifically, the thesis will attempt to validate the following two hypotheses:

- An on-chip antenna can be integrated in a silicon photonic fabrication technology.
- An RC filter and moving average filter can be integrated in a silicon photonic fabrication technology.

To validate these hypotheses, three different implementations are explored. The first is an on-chip antenna in the silicon photonics platform, which is used to study the feasibility of wireless RF inter-chip data transmission in this platform. The second implementation is a low-pass filter driven by a photodetector. This implementation studies the idea of relocating bulky passives such as capacitors and resistors from the integrated circuit side to the silicon photonics platform to potentially save cost. Finally, the third implementation combines optical delay lines, photodetectors, and a capacitor to develop a moving average filter. A moving average filter in the silicon photonics platform may be valuable for the emerging low-bandwidth optical receivers and has the potential to enable the design of receivers with superior performance.

These objectives are summarized below:

- Develop optical receivers that leverage silicon photonics.
  - Develop an optical receiver that uses silicon photonic delay lines to replace quadrature clock phase generation in optical receivers.
  - Develop a single-phase optical receiver that leverages silicon photonics to simplify the design, which may lead to improved performance.
  - Develop an optical receiver with a novel low-bandwidth integrating front-end that may lead to achieving higher data rates.
- Develop passives in the silicon photonics platform to explore their potential. This may enable the development of better integrated co-designed systems.
  - Develop an RF antenna in the silicon photonics platform and explore the possibility of inter-chip transmission.
  - Develop an RC filter in the silicon photonics platform.
  - Develop a moving average filter in silicon photonics that may prove valuable at the front-end of the emerging low-bandwidth optical receivers.

### 1.3 Claim of Originality

The high-level engineering approach used to boost performance is the co-design and co-optimization of electronics and photonics. This design mentality allowed the proposed systems to achieve better performance than other receivers. The specific advances and contributions are detailed below:

- Two optical receivers with conventional front-ends that leverage optical delay lines were developed. This novel technique simplifies clock phase generation in optical receivers to achieve improved energy efficiency and reduce clock generation power consumption. The first receiver, presented in Chapter 3, is a first demonstration that used silicon photonic delay lines to replace clock phase generation circuits. While the silicon photonic delay lines were published by the group prior to the start of this Ph.D., the contribution presented in this thesis is the first demonstration of the IC chip leveraging the optical delay lines. This is more a demonstration of feasibility than an attempted engineering advance, since the design approach is an invention (patent no. 9,917,650) and this Ph.D. prototyped the novel front-end. The second receiver, presented in Chapter 4, is a simplified receiver that leverages optical delay lines to achieve a demultiplex-by-two operation. The engineering advance here is the simplicity of the design and the removal of the voltage gain stages. The result is the superior energy efficiency achieved by the receiver. Both receivers are demonstrated experimentally [5, 6]. The authors of [6] believe that the achieved energy efficiency of 156 fJ/bit is the best reported compared to the state of the art.

- A novel optical receiver with a low-bandwidth two-bit integrating optical front-end was developed that, in addition to leveraging optical delay lines for clock generation, employs a novel low-bandwidth front-end to achieve high-speed operation. This third receiver, presented in Chapter 5, utilizes an integrating front-end to achieve high speed in the 65 nm CMOS technology node. The engineering advance is the innovative integrating front-end described in the chapter. This receiver was verified through experimental measurements with a photodetector array [7] and achieves a speed of 22 Gb/s with an energy efficiency of 1.43 pJ/bit.
- An on-chip antenna in the silicon photonics stack was developed, and inter-chip data transmission was demonstrated for the first time in this stack. Antenna design, fabrication, and measurements are described in detail [8]. The antenna may be used for inter-chip communication applications involving a central control unit or microprocessor and several optical receivers without the need for wire bonding.

- An RC filter and a moving average filter were developed in the silicon photonics platform. These two implementations allow some passives to be relocated from the integrated circuit side to the photonic chip side, potentially saving cost and enabling better-designed integrated systems. Both filters were designed, fabricated, and verified experimentally [9, 10]. The moving average filter and the low-pass filter are exploratory research ideas that could eventually be integrated within the optical receiver. For example, the moving average filter could be used to replace the reset function in the receiver that is proposed in Chapter 5.
- The passive structures in silicon photonics are not engineering advances that attempt to improve performance, but rather exploratory and innovative designs that attempt to assess and measure the feasibility of passives in the SiPh platform.

#### 1.3.1 Publications and contributions of the author

The contents of this thesis are presented in several publications that include six journal articles, and one conference proceeding. The following is a list of publications and contributions of the author. The conference paper used in this thesis, [11], acts as a survey of electronic and optical delay lines and is presented in the background section. Each of the optical receivers is presented in a separate journal paper [5–7], and each of the three passives implemented in the silicon photonic stack is presented in a separate journal paper [8–10].

Additionally, the author authored and co-authored four additional conference publications not related to this thesis [12–15].

#### Peer-reviewed Journal Articles:

- [5] B. Radi, M. S. Nezami, M. Ménard, F. Nabki and O. Liboiron-Ladouceur, "A 12.5 Gb/s 1.93 pJ/bit Optical Receiver Exploiting Silicon Photonic Delay Lines for Clock Phases Generation Replacement," in IEEE Transactions on Circuits and Systems II: Express Briefs, doi: 10.1109/TCSII.2019.2952591 (Early access).
  - B. Radi: Proposed the idea, designed and drew the layout of the receiver, performed all the measurements, and wrote the manuscript.
  - M. S. Nezami: Assisted with the measurements.
  - M. Ménard: Provided feedback, edited, and reviewed the manuscript.
  - F. Nabki: Provided feedback, edited, and reviewed the manuscript.
  - O. Liboiron-Ladouceur: Supervised the work, edited, and reviewed the manuscript.
- [6] M. Taherzadeh-Sani, B. Radi, M. S. Nezami, M. Ménard, O. Liboiron-Ladouceur and F. Nabki, "A 17 Gbps 156 fJ/bit Two-Channel Optical Receiver With Optical-Input Split and Delay in 65 nm CMOS," in IEEE Transactions on Circuits and Systems I: Regular Papers, doi: 10.1109/TCSI.2020.2976197 (Early access).
  - M. Taherzadeh-Sani: Designed and drew the layout of the receiver, performed measurements, and wrote the manuscript.
  - B. Radi: Involved with the design and the measurements of the receiver. Assisted with drafting the manuscript.
  - M. S. Nezami: Assisted with the measurements.
  - M. Ménard: Provided feedback, edited, and reviewed the manuscript.
  - O. Liboiron-Ladouceur: Provided feedback, edited, and reviewed the manuscript.
  - F. Nabki: Supervised the work, edited, and reviewed the manuscript.
- [7] B. Radi, M. Taherzadeh-Sani, M. S. Nezami, F. Nabki, M. Ménard, and O. Liboiron-Ladouceur, "A 22 Gb/s time-interleaved low-power optical receiver with a two-bit integrating front-end," IEEE Journal of Solid-State Circuits (Accepted, ID: JSSC-19-0447.R2).
  - B. Radi: Proposed the idea, designed and drew the layout of the receiver, performed all the measurements, and wrote the manuscript.
  - M. S. Nezami: Assisted with the measurements.
  - M. Taherzadeh-Sani: Provided feedback on the design and assisted with revisions.
  - F. Nabki: Provided feedback, edited, and reviewed the manuscript.
  - M. Ménard: Provided feedback, edited, and reviewed the manuscript.
  - O. Liboiron-Ladouceur: Supervised the work, edited, and reviewed the manuscript.
- [8] B. Radi, A. S. Dhillon and O. Liboiron-Ladouceur, "Demonstration of Inter-Chip RF Data Transmission Using On-Chip Antennas in Silicon Photonics," in IEEE Photonics Technology Letters, vol. 32, no. 11, pp. 659-662, 1 June, 2020, doi: 10.1109/LPT.2020.2991118.
  - B. Radi: Proposed the idea, designed, and drew the layout of the antenna, performed all the measurements, and wrote the manuscript.
  - A. S. Dhillon: Assisted with the measurements.
  - O. Liboiron-Ladouceur: Supervised the work, edited, and reviewed the manuscript.
- [9] M. S. Nezami, B. Radi, A. Gour, Y. Xiong, M. Taherzadeh-Sani, M. Ménard, F. Nabki, and O. Liboiron-Ladouceur, "Integrated RF Passive Low-Pass Filters in Silicon Photonics," in IEEE Photonics Technology Letters, vol. 30, no. 23, pp. 2052-2055, 2018, doi: 10.1109/LPT.2018.2875895.
  - M. Sanadgol Nezami: Measured the filter and wrote the manuscript.
  - B. Radi: Assisted with the design, with measurements, and with the drafting of the manuscript.
  - A. Gour: Assisted with the design.
  - Y. Xiong: Assisted with the design.
  - M. Taherzadeh-Sani: Provided feedback on the manuscript.
  - M. Ménard: Provided feedback, edited, and reviewed the manuscript.
  - F. Nabki: Provided feedback, edited, and reviewed the manuscript.
  - O. Liboiron-Ladouceur: Supervised the work, edited, and reviewed the manuscript.
- [10] M. S. Nezami, B. Radi, M. Taherzadeh-Sani, Y. Xiong, M. Ménard, F. Nabki, and O. Liboiron-Ladouceur, "A high-speed moving average integrator in silicon photonics for TIA-less receivers," IEEE Photonics Technology Letters (Accepted, ID: PTL-37154-2020.R1).
  - M. Sanadgol Nezami: Measured the filter and wrote the manuscript.
  - B. Radi: Generated the idea, assisted with the design, with measurements, and with the drafting of the manuscript.
  - Y. Xiong: Designed the filter.
  - M. Taherzadeh-Sani: Provided feedback on the manuscript.
  - M. Ménard: Provided feedback, edited, and reviewed the manuscript.
  - F. Nabki: Provided feedback, edited, and reviewed the manuscript.
  - O. Liboiron-Ladouceur: Supervised the work, edited, and reviewed the manuscript.

#### Peer-reviewed Conference Articles:

- [11] B. Radi, A. S. Dhillon, and O. Liboiron-Ladouceur, "Towards integrated RF photodetector-antenna emitters in silicon photonics," in 2020 IEEE Photonics Conference (IPC) (Accepted, Paper ID = 147, Conference date: September 28th – October 1st, 2020).
  - B. Radi: Wrote the manuscript.
  - A. S. Dhillon: Assisted with the measurements.
  - O. Liboiron-Ladouceur: Supervised the work, edited, and reviewed the manuscript.
- [16] B. Radi and O. Liboiron-Ladouceur, "A survey of optical and electronic delay lines with a case study on using optical delay lines in 65nm CMOS optical receivers," in 2020 IEEE International Midwest Symposium on Circuits and Systems (MWSCAS) (Accepted, Paper ID = 3226, Conference date: August 9th – August 12th, 2020).
  - B. Radi: Wrote the manuscript.
  - O. Liboiron-Ladouceur: Supervised the work, edited, and reviewed the manuscript.

#### Peer-reviewed Conference Articles not related to this thesis:

- [12] Y. Xiong, F. G. de Magalhães, B. Radi, G. Nicolescu, F. Hessel, and O. Liboiron-Ladouceur, "Towards a fast centralized controller for integrated silicon photonic multistage MZI-based switches," in 2016 Optical Fiber Communications Conference and Exhibition (OFC), pp. 1–3, 2016.
- [13] V. E. Paul, B. Radi, V. Tolstikhin, and O. Liboiron-Ladouceur, "A technology-based comparative study for the optoelectronic integration of optical front-ends," in 2016 Photonics North (PN), pp. 1–1, 2016.
- [14] B. Radi, V. E. Paul, V. Tolstikhin, and O. Liboiron-Ladouceur, "Comparative study of optoelectronics receiver front-end implementation in InP, SiGe, and CMOS," in 2016 IEEE Photonics Conference (IPC), pp. 222–223, 2016.
- [15] H. R. Mojaver, A. Das, B. Radi, V. Tolstikhin, K.-W. Leong, and O. Liboiron-Ladouceur, "Scalable SOA-based lossless photonic switch in InP platform," in Optical Interconnects 2020 (Accepted, Paper ID = 25, Conference date: September 27th – October 1st, 2020).

### 1.4 Thesis Organization

This thesis consists of two themes: the co-design of electronics and photonics for optical receiver design and the implementation of passives in silicon photonics. Chapter 2 reviews some background information necessary to understand the contents of this thesis. The first theme is covered in Chapter 3, Chapter 4, and Chapter 5. Chapter 3 describes the design of a 12.5 Gb/s demux-by-four receiver with a conventional front-end that leverages silicon photonic delay lines for clock phase generation. Chapter 4 describes a 17 Gb/s demux-by-two receiver that leverages the optical delay lines to simplify the receiver and the clocking to achieve superior energy efficiency. Chapter 5 describes a novel optical receiver with a low-bandwidth two-bit integrating front-end that achieves a speed of 22 Gb/s. All three receivers are implemented in 65 nm CMOS and are verified experimentally. The second theme is covered in Chapter 6, Chapter 7, and Chapter 8. Chapter 6 describes the design of a 15 GHz RF antenna in silicon photonics. This chapter also describes, for the first time, the demonstration of inter-chip data transmission in silicon photonics. Chapter 7 presents the design of a low-pass RC filter in the silicon photonics process. Chapter 8 is an extension of Chapter 7 and describes the implementation of a moving average filter in silicon photonics using delay lines and an on-chip capacitor. Finally, Chapter 9 proposes future work and concludes the thesis.

## Chapter 2

## Background

This chapter presents some background information necessary to understand and appreciate the contents of this thesis. The first section provides general and basic information about the silicon photonics process. The second section is a survey of commonly used electronic delay lines and an overview of the optical delay lines that will be used in the next three chapters. The third section is a more detailed look at passive optical delay lines and provides information about demultiplexing in optical receivers. Finally, the fourth section gives a general overview of optical receiver design and challenges.

### 2.1 An overview of silicon photonics

This section provides a brief overview of the silicon photonics process. The silicon photonics process by Advanced Micro Foundry (AMF) (formerly IME) is shown in Fig. 2.1 with some device examples [17].

Fig. 2.1: Silicon photonics process cross-section [17].

The process consists of a silicon-on-insulator (SOI) wafer with a 220 nm silicon layer for devices and a 2 $\mu$m buried oxide (BOX) layer. The device layer can be etched at 0 nm, 90 nm, and 160 nm in addition to the standard etching at 220 nm. The process has 6 implants for optical modulators (P++, P+, P, N++, N+, N) and germanium deposition and implanting for photodetectors. The process also provides contact vias and two aluminum layers.

The stack of the silicon photonics process is shown in Fig. 2.2. The process consists of a Si substrate that is 120 $\mu$m thick. On top of the substrate, there is a silicon oxide layer (BOX) that is 2 $\mu$m thick, followed by a 220 nm Si layer. This layer can be etched to build waveguides and grating couplers or implanted to make different components such as photodetectors. Ge can be deposited on top of this Si layer to complete the structure of the photodetector. The two metal layers can be connected to the 220 nm Si layer and to each other using vias. The metal layers are covered with oxide cladding.

Fig. 2.2: Silicon photonics process cross-section [17].

The process allows for the fabrication of many devices such as the ones shown in Fig. 2.1.

A brief description of some of the devices that can be fabricated in the silicon photonics process is provided next.

A grating coupler is a structure that allows light to be coupled into and out of the chip. Grating couplers are built by alternating the etching height, which essentially means periodically alternating the refractive index. This leads to strong frequency (or wavelength) selection: the incident light is diffracted along the coupler and eventually guided along a waveguide, or from a waveguide to free space.

Fig. 2.3: Photodetector model in [19].

Channel waveguides and rib waveguides are used to guide light on the silicon photonic chip and serve as interconnects between different blocks on the chip. These waveguides can have different insertion loss depending on the width and the structure used. For example, the insertion loss in the waveguides used in [18] is 3 dB/cm for 220 nm $\times$ 500 nm channel waveguides and 0.2 dB/cm for 220 nm $\times$ 3 $\mu$m waveguides. These waveguides can be used to build passive optical delay lines as detailed later in the chapter.
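To give a feel for these loss figures, the following is a minimal sketch that estimates the length and propagation loss of a one-bit delay line at 12.5 Gb/s. It assumes the group delay per unit length implied by the 400 ps, 32.3 mm delay line quoted in the caption of Fig. 8.7, and it reuses that value for both waveguide types, which is a simplification. The function and variable names are hypothetical.

```python
# Sketch: length and propagation loss of a passive optical delay line.
# Assumption: the group delay per unit length is taken from the 400 ps,
# 32.3 mm delay line quoted in the caption of Fig. 8.7 (1550 nm) and is
# reused for both waveguide types, which is a simplification.
GROUP_DELAY_PS_PER_MM = 400.0 / 32.3


def delay_line_length_mm(delay_ps: float) -> float:
    """Waveguide length needed to obtain the requested delay."""
    return delay_ps / GROUP_DELAY_PS_PER_MM


def propagation_loss_db(length_mm: float, loss_db_per_cm: float) -> float:
    """Propagation loss of a waveguide of the given length."""
    return (length_mm / 10.0) * loss_db_per_cm


# One-bit delay at 12.5 Gb/s, i.e. a bit period of 80 ps.
bit_period_ps = 1e3 / 12.5
length_mm = delay_line_length_mm(bit_period_ps)

loss_wide_db = propagation_loss_db(length_mm, 0.2)    # 220 nm x 3 um waveguide
loss_narrow_db = propagation_loss_db(length_mm, 3.0)  # 220 nm x 500 nm waveguide

print(f"length = {length_mm:.2f} mm")
print(f"loss (wide) = {loss_wide_db:.2f} dB, loss (narrow) = {loss_narrow_db:.2f} dB")
```

Under these assumptions the wide waveguide keeps the loss of a one-bit delay well below 1 dB, which is why low-loss wide waveguides are attractive for long delay lines.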

Photodetectors are key components used at the interface of optical chips and electronic chips. Photodetectors convert light into current through the absorption of photons. They are characterized by their operating wavelength, bandwidth, and responsivity. As shown in Fig. 2.3, a photodetector is usually modeled as a current source, $I_{pd}$, in parallel with a junction capacitance, $C_{pd}$, that represents the capacitance of the reverse-biased PN junction. The parallel combination of the current source and junction capacitance is in series with a resistance, $R_{pd}$, that represents the effective resistance of the junction. Finally, an inductor, $L_p$, is used to represent the peaking inductor (if used) and any bond wire parasitic inductance.

The responsivity, $R$, of the photodetector relates the generated photocurrent to the incident optical power, $P_{opt}$, as shown in equation 2.1.

$$I_{pd} = R \times P_{opt} \tag{2.1}$$
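As a small numerical illustration of equation 2.1, the sketch below converts an optical power in dBm to watts and computes the resulting photocurrent. The responsivity of 1 A/W is a hypothetical round number chosen for illustration, not a measured value from this thesis, and the function names are assumptions.

```python
# Sketch of equation 2.1: I_pd = R * P_opt.
# Assumption: the responsivity below is a hypothetical round number.


def dbm_to_watts(power_dbm: float) -> float:
    """Convert an optical power from dBm to watts."""
    return 1e-3 * 10.0 ** (power_dbm / 10.0)


def photocurrent_amps(responsivity_a_per_w: float, optical_power_w: float) -> float:
    """Photocurrent generated for a given incident optical power."""
    return responsivity_a_per_w * optical_power_w


RESPONSIVITY_A_PER_W = 1.0  # hypothetical value

# Illustrative input powers, in dBm.
for power_dbm in (-4.0, -7.0):
    p_opt_w = dbm_to_watts(power_dbm)
    i_pd_a = photocurrent_amps(RESPONSIVITY_A_PER_W, p_opt_w)
    print(f"{power_dbm:+.1f} dBm -> {p_opt_w * 1e6:.1f} uW -> {i_pd_a * 1e6:.1f} uA")
```

Because the relation is linear in power, every 3 dB of optical loss ahead of the photodetector roughly halves the photocurrent available to the receiver front-end.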

Electro-optic modulators are devices used to modulate continuous-wave light by applying a voltage to a certain region of the device. The applied voltage changes the refractive index, and the change in the refractive index leads to a phase change. The phase change can be set to 0 degrees or 180 degrees by applying different voltages. It is then possible to combine this modulated signal with a copy of the same signal to achieve on-off keying modulation.
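The conversion from phase modulation to on-off keying can be sketched with the textbook transfer function of an ideal two-arm interferometer, where the normalized output power is $\cos^2(\Delta\phi/2)$. The sketch assumes a lossless device with a perfectly balanced power split, which is an idealization, and the function name is hypothetical.

```python
import math

# Sketch: normalized output power of an ideal two-arm interferometer.
# Assumptions: lossless arms and a perfectly balanced 50/50 split and
# recombination, so that P_out / P_in = cos^2(delta_phi / 2).


def interferometer_transmission(delta_phi_rad: float) -> float:
    """Fraction of the input power at the output for a given phase difference."""
    return math.cos(delta_phi_rad / 2.0) ** 2


# A phase difference of 0 degrees gives the "on" level, 180 degrees the "off" level.
for phase_deg in (0.0, 90.0, 180.0):
    t = interferometer_transmission(math.radians(phase_deg))
    print(f"phase difference = {phase_deg:5.1f} deg -> transmission = {t:.3f}")
```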

Directional couplers are devices that can be used as optical power splitters. Directional couplers are created by placing two waveguides with a certain length in close proximity. The fields will transfer from one waveguide to the other as they travel along the waveguide. By choosing the proper length, the amount of field energy transferred can be effectively controlled. This sets the power splitting ratio.
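The dependence of the splitting ratio on the coupler length can be sketched with the standard coupled-mode result for two identical, lossless waveguides, where the power transferred to the adjacent waveguide is $\sin^2(\kappa L)$. The coupling coefficient used here is a hypothetical value chosen for illustration; in practice it depends on the gap, the waveguide geometry, and the wavelength. The function names are assumptions.

```python
import math

# Sketch: power splitting in an ideal directional coupler.
# Assumptions: identical, lossless waveguides described by coupled-mode
# theory, and a hypothetical coupling coefficient.
KAPPA_PER_UM = 0.05  # hypothetical coupling coefficient, in rad/um


def cross_power_fraction(kappa_per_um: float, length_um: float) -> float:
    """Fraction of the input power transferred to the adjacent waveguide."""
    return math.sin(kappa_per_um * length_um) ** 2


def bar_power_fraction(kappa_per_um: float, length_um: float) -> float:
    """Fraction of the input power remaining in the input waveguide."""
    return math.cos(kappa_per_um * length_um) ** 2


def length_for_split_um(kappa_per_um: float, cross_fraction: float) -> float:
    """Shortest coupling length that yields the requested cross-port fraction."""
    return math.asin(math.sqrt(cross_fraction)) / kappa_per_um


half_split_length_um = length_for_split_um(KAPPA_PER_UM, 0.5)
print(f"50/50 split length = {half_split_length_um:.2f} um")
print(f"cross = {cross_power_fraction(KAPPA_PER_UM, half_split_length_um):.3f}, "
      f"bar = {bar_power_fraction(KAPPA_PER_UM, half_split_length_um):.3f}")
```

Other ratios, such as the unequal splits needed when several couplers are cascaded to feed multiple photodetectors with equal power, follow from the same function with a different target fraction.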

Finally, Y-branches are passive optical devices used to split the optical power in half.

This section provided a brief overview of the silicon photonic process and some of the devices that could be implemented. Of importance to this thesis are waveguides, photodetectors, directional couplers, and Y-branches. The following sections provide more details about optical delay lines that are built using waveguides.

#### 2.1.1 Motivation for passive electronics in Silicon Photonics

Several aspects make silicon photonics an attractive platform for the implementation of passives. The first aspect is the low cost per area, which makes it a suitable platform to host bulky passive components such as capacitors, inductors, and antennas. For example, according to [20], the cost per mm<sup>2</sup> for the photonic chip provided by Advanced Micro Foundry (AMF) is \$225 CAD as opposed to \$8,775 CAD for a chip implementation in TSMC 28 nm CMOS process technology. The CMOS implementation is therefore almost 40 times more expensive per unit area in this case. Moreover, with more advanced technology nodes, the area that passives occupy becomes increasingly large compared to the number of transistors that could fit within the same area. This makes the implementation of passives inefficient in advanced technology nodes. To give a sense of the size passives can occupy compared to transistors, Fig. 2.4 shows the layout of a standard N-channel MOSFET with a length of 60 nm and a width of 200 nm next to the layout of two metal-insulator-metal (MIM) capacitors in a 65 nm CMOS technology process. The first is a 10 fF capacitor with a length and width of 2 $\mu$m and the second is a 200 fF capacitor. It is evident that even small capacitors occupy a much larger area than transistors, which motivates relocating them when possible. Passives such as antennas and inductors could occupy even larger areas. Further, as silicon photonics has a typical minimum feature size above 100 nm, bulky passives can be fabricated with less advanced photolithography tools than those required by advanced CMOS processes.

Another aspect that makes silicon photonics advantageous to use for the implementation



**Fig. 2.4:** Layout of an N-channel MOSFET next to two MIM capacitors, illustrating that passives can occupy a much larger area than transistors.

of passives is the low substrate conductivity. In some CMOS processes, the substrate is made conductive to avoid latch-up issues that may damage the chip. While this conductivity is not a problem for digital circuits, it degrades the performance of analog circuits and RF passives. The silicon photonics process is characterized by its low-conductance substrate, which makes it suitable for the implementation of passives with a high quality factor. For example, antennas favor a low-conductivity substrate. If the substrate is conductive, electromagnetic power is dissipated as heat, degrading the radiation efficiency of

the antenna [21]. To avoid this in CMOS processes with high conductivity, etching could be done to thin the substrate to reduce heat losses, but this increases cost. This is avoided when the antenna is implemented in silicon photonics. There are other advantages for low conductivity such as lower crosstalk between components and the suppression of substrate noise.

The benefits of relocating passives from the IC side to the silicon photonics side are summarized in Fig.2.5.

It should be noted the trade-off will be in achieving efficient, cost-effective packaging of the IC chip and the silicon photonics chip. This will depend on the specific application and may lead to other trade-offs such as increased complexity or higher power consumption.

To show the feasibility of designing passives in silicon photonics, experimental demonstrations of implementations of an RC filter driven by a photodetector, a moving average filter, and an antenna are presented in this thesis.

## 2.1.2 A note on process variations in Silicon Photonics

Process variations are an essential consideration when designing passives in the silicon photonics platform. It is, therefore, critical to design for robustness. This can be done by running corner simulations and Monte Carlo simulations and then taking the results into account during the design phase. Additionally, because the process lacks MOSFET switches, it may be necessary to include feedback circuits from the IC to the SiPh side to



**Fig. 2.5:** Summary of benefits of relocating passives from the IC side to the silicon photonics side.

stabilize the performance of the passives. An example of this technique is presented in [3], where a thermal tuning loop is used to stabilize the microring drop filter resonance wavelength.

## 2.2 Motivation for optical versus electronic delay lines

Delay lines are used in many applications such as time-to-digital converters (TDCs) and digital-to-time converters (DTCs) for the digitization of short time intervals. They are also used in clock generation and clock distribution applications. Moreover, they are used in signal deskew applications and for edge alignment.

Optical delay lines will be used heavily in the next three chapters. Consequently, in this section, several CMOS and optical delay lines are reviewed and compared in terms of

resolution, delay range, power consumption, and tunability. This comparison will highlight the potential benefits of optical delay lines as compared to electronic delay lines. Subsection 2.2.1 describes some of the most used electronic delay lines. Subsection 2.2.2 describes recent developments of optical delay lines. Subsection 2.2.3 briefly addresses some of the considerations for selecting the optimum delay line for a given application.

#### 2.2.1 CMOS Delay Lines

Electronic delay lines are the most commonly used due to their low complexity and low cost. These delay lines can have a single output or can have multiple outputs where the output corresponding to the required delay is selected. Delay line elements can be tuned with an analog signal or can be digitally controlled. Four different delay line architectures are reviewed in this subsection.

#### Inverter-based tapped delay line and single output delay line architectures

In the inverter-based tapped delay line architecture [22], delay line elements are cascaded and the output corresponding to the required delay is selected. The most commonly used delay element is an inverter, but other delay elements such as flip-flops [23] can be used as well. This architecture is shown in Fig. 2.6. A mux is needed to select the required delay. Alternatively, the single output implementation is shown in Fig. 2.7 and could be used to eliminate the multiplexer. This delay line is digitally controlled using tri-state inverters that



Fig. 2.6: Inverter-based tapped delay line.

are enabled and disabled based on the required delay.

These architectures can have a wide delay range, and the range increases with the number of stages, but the resolution is limited to twice the gate delay of the delay element. Since the resolution is set by the gate delay, it improves with the technology node, where smaller nodes allow for finer resolutions. The power consumption of this type of delay line is high and increases with delay range as more delay elements are needed. These two delay lines can be used only with digital signals such as clocks.

#### Sub-gate resolution two-path delay line

The architecture in Fig. 2.8 allows for sub-gate delays. In this architecture [24], the digital input signal is fed to two different paths with a different delay (fast path and slow path).



Fig. 2.7: Single output delay line.

MOS capacitors are used to slow down the signal in the lower slow path. The difference between the two paths is less than a gate delay and hence a sub-gate delay is achieved. The control signal is used to enable the appropriate path based on the required delay. The delay range of this technique is limited, and the power consumption is higher than the previous architectures discussed for a given range. This delay line does not scale linearly.



Fig. 2.8: Two path delay line capable of sub-gate delays.

#### Analog delay buffer based delay lines

A delay line element that can be used to build analog delay lines is the analog buffer shown in Fig. 2.9 [25]. This delay line is controlled by an analog signal and the delay is adjusted by varying the control voltages  $V_C$  and  $V_{CB}$  which in turn changes the load of the circuit changing the speed of the buffer. Analog buffer-based delay lines can have good resolution but are power-hungry due to static power consumption.



Fig. 2.9: Analog buffer that can be used to delay analog signals and is controlled by adjusting the load.

#### Supply voltage controlled and current starved delay lines

In these types of delay lines, either the supply voltage is used to control the delay of the delay line (Fig. 2.10) using an analog signal [26], or the biasing current is changed using a digital signal (Fig. 2.11) [27]. In either case, the current drawn is changed and the rate at which the load capacitor is charged changes accordingly. The first technique requires a supply source capable of providing substantial amounts of current and the resulting delay is not as fine as other techniques. The second technique is reported to achieve good resolution and range [27].



Fig. 2.10: Supply controlled delay line.

## 2.2.2 Optical delay lines

Optical delay lines can be divided into mechanically controlled free space delay lines, passive optical delay lines, and electronically controlled delay lines. Optical delay lines are modeled throughout the thesis as shown in Fig. 2.12.



Fig. 2.11: Current starved delay line.



Fig. 2.12: Optical delay line model.

#### Mechanically controlled free space delay lines

In this kind of optical delay line, a gap opening is controlled mechanically, changing the distance the light must traverse and thus controlling the delay. Such delay lines are readily available as products (e.g. Santec ODL-330 [28]) and can have delay ranges of 400 ps. The resolution is mechanically controlled and can be as small as 0.2 ps. As those delay lines are passive, they consume no power. Free space delay lines can attenuate the signal and have an insertion loss on the order of 1.5 dB. While those delay lines can have an impressive resolution and range without consuming power, they are not suitable for integrated systems

due to their large gap size (45 mm in case of ODL-330) and the need for mechanical delay control.

#### Passive integrated optical delay lines

This delay line is implemented using an optical waveguide whose length corresponds to a fixed required delay. It can be implemented on-chip and is suitable for integration with electronic receivers and systems. Depending on the required delay, such delay lines can have a small size with a compact layout. For example, in [18], a 100 ps delay line of 7.2 mm in length has a rectangular nested layout with a size of 250 $\mu$m × 250 $\mu$m. This delay line also has low insertion loss, which can be as low as 0.2 dB for a 50 ps delay. Since those delay lines are passive, they consume no power. Those delay lines are not tunable but can have accurate delays. An error of 3 ps can be expected for a 50 ps delay [18]. This kind of delay line is used extensively in this thesis and is described in more detail in Section 2.4.
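The relation between waveguide length and delay follows from the standard expression $\tau = n_g L / c$, where $n_g$ is the group index. The sketch below infers an effective group index from the quoted 100 ps, 7.2 mm delay line and scales it to a 50 ps delay. The group index is an inferred quantity, not a value stated in the source, and the function names are hypothetical.

```python
C_VACUUM = 299_792_458.0  # speed of light in vacuum, m/s


def implied_group_index(delay_s: float, length_m: float) -> float:
    """Group index implied by a waveguide of given length and delay."""
    return C_VACUUM * delay_s / length_m


def length_for_delay(delay_s: float, group_index: float) -> float:
    """Waveguide length needed for a target delay: L = c * tau / n_g."""
    return C_VACUUM * delay_s / group_index


# Quoted example: 100 ps delay implemented with a 7.2 mm waveguide.
n_g = implied_group_index(100e-12, 7.2e-3)

# Same waveguide geometry, 50 ps delay (one bit at 20 Gb/s).
length_50ps = length_for_delay(50e-12, n_g)

print(f"implied n_g = {n_g:.2f}")
print(f"length for 50 ps = {length_50ps * 1e3:.2f} mm")
```

Since the length scales linearly with the delay, halving the bit period halves the required waveguide length for the same cross-section.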

#### Electronically controlled optical delay lines

Integrated optical delay lines can be made tunable by using ring resonators and Mach–Zehnder interferometers (MZIs). One such implementation is reported in [29] and is shown in Fig. 2.13. In this implementation, a ring resonator is used for fine control of the delay and can have a continuous delay range of up to 23 ps. The MZI array of eight elements

is used for coarse delay control by selecting the delay path. This technique is reported to allow for a continuous delay of up to 1.27 ns. The power consumption ranges from 12 to 33 mW depending on the delay. This delay line has a high insertion loss of 12.4 dB.



Fig. 2.13: Electronically tunable optical delay lines.

#### 2.2.3 Trade-off considerations

When selecting the appropriate delay line, the requirements of the application need to be considered. Considerations include power consumption, resolution, range, tunability, tuning mechanism and signal, integrability, and type of input signal (electrical, optical, analog, digital).

All electronic delay lines reviewed can be used exclusively with digital signals, except for the analog buffer. Moreover, chaining analog buffers to increase the range limits the bandwidth of the chain, making this type of delay line suitable only for slow analog signals or short delays. Optical delay lines have no limitation on the type of information the optical signal carries, digital or analog.

As discussed in the previous subsection, electronic delay lines exhibit a trade-off between

delay range and power consumption. Higher delays and wider delay range usually result in higher power consumption. Free space and passive integrated optical delay lines do not suffer from this limitation and consume no power regardless of the delay or the range. This trade-off is still true for electronically controlled optical delay lines.

The resolution of electronic delay lines depends on the technique used. Mechanically and electronically tunable optical delay lines offer superior performance in this regard compared to electronic delay lines. Mechanically tunable delay lines have a resolution as small as 0.2 ps and electronically tunable optical delay lines offer continuous delay. Passive integrated optical delay lines are not tunable.

In terms of integration, electronic delay lines are simpler and suitable for digital systems and slow analog signals but can be limited in terms of resolution and bandwidth. Optical delay lines are more difficult to integrate as they mostly need to be implemented in a different technology such as silicon photonics and are also more difficult to control as they need external mechanical or electronic tuning. However, they could provide virtually infinite bandwidth, infinitesimal resolution, or zero power consumption.

In terms of noise, both the free space and the passive waveguide delay lines used in this thesis are passive and do not contribute noise. They only have insertion loss, which lowers the optical power of the signal and therefore directly affects the sensitivity. The sensitivity equation is given by:

$$\text{Sensitivity} = -174\ \text{dBm/Hz} + NF + 10\log_{10}(BW) + SNR \tag{2.2}$$

When delay lines with an insertion loss of L are used, then the sensitivity changes to:

$$\text{Sensitivity} = -174\ \text{dBm/Hz} + NF + 10\log_{10}(BW) + SNR + L \tag{2.3}$$

The insertion loss has a significant impact when the SNR is low while becoming less important as the SNR increases.
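The effect of the insertion loss term can be illustrated numerically. The following is a minimal sketch of equations 2.2 and 2.3; the noise figure, bandwidth, SNR, and insertion loss values are hypothetical placeholders chosen only for illustration.

```python
import math


def sensitivity_dbm(nf_db: float, bw_hz: float, snr_db: float,
                    insertion_loss_db: float = 0.0) -> float:
    """Sensitivity in dBm: -174 dBm/Hz + NF + 10*log10(BW) + SNR + L.

    With insertion_loss_db = 0 this is equation 2.2; a nonzero value
    gives equation 2.3.
    """
    return (-174.0 + nf_db + 10 * math.log10(bw_hz)
            + snr_db + insertion_loss_db)


# Hypothetical values: NF = 10 dB, BW = 8.75 GHz, SNR = 17 dB.
without_delay_line = sensitivity_dbm(10.0, 8.75e9, 17.0)
with_delay_line = sensitivity_dbm(10.0, 8.75e9, 17.0, insertion_loss_db=3.2)

print(f"no delay line:   {without_delay_line:.2f} dBm")
print(f"with delay line: {with_delay_line:.2f} dBm")
```

In this model the insertion loss adds dB for dB to the required input power, so a lower-loss delay line translates directly into a better sensitivity.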

The next section provides details about how passive optical delay lines could be used to replace clock phase generation circuits in optical receivers with demultiplexed output.

# 2.3 Optical receivers with demultiplexed output

This section provides a high-level overview of optical receivers. Two aspects are briefly discussed: bandwidth of the front-end and number of clock phases used in the receiver.

# 2.3.1 Brief overview of conventional source-synchronous optical receivers

Fig. 2.14 shows a conventional source-synchronous optical receiver that consists of a transimpedance amplifier (TIA) at the input connected to the photodetector. The TIA is

used to convert the photocurrent into voltage. The limiting amplifier stage (LA) follows the TIA and is used to amplify the voltage output of the TIA. The amplified voltage is then fed to several decision circuits each clocked with sub-rate clock phases. In Fig. 2.14, the number of clock phases is four. The sampling of incoming bits is demonstrated for a four clock phase system and two clock phase system on the left side and right side of Fig. 2.15, respectively. To generate those clock phases, a clock phase generation circuit is used. Finally, the sub-rate outputs of the decision circuits are fed to a buffer to drive the next stage or measurement equipment.

In conventional systems, the demultiplex-by-four system requires four clock phases and produces a quarter-rate output, while the demultiplex-by-two system requires two clock phases and produces a half-rate output. The demultiplex-by-four system has relaxed timing requirements, potentially allowing for higher speeds, but requires twice the number of latches.

## 2.3.2 Clock phases in optical receivers

The clock phase generation circuit consumes power and reduces the energy efficiency of the receiver. It is possible to use the architecture in Fig. 2.16 to eliminate the clock phase generation circuit. In this architecture, the input signal is split into four sub-signals and each is delayed incrementally by one bit. This allows the receiver to operate using one clock phase, as illustrated in Fig. 2.17 for four quarter-rate outputs (left) and two half-rate outputs (right). The implementation of a silicon photonics structure that can be used to achieve this split



Fig. 2.14: Conventional optical receiver with demultiplexed outputs.



Fig. 2.15: Sampling in conventional four clock phase receiver (left) and two clock phase receiver (right).



Fig. 2.16: An optical receiver architecture that uses split and delay functionality to produce demultiplexed outputs.

and delay functionality is detailed in [18] and patented in [30]. This is briefly discussed in the next section. The receivers in Chapter 3 and Chapter 4 are designed to demonstrate this concept.
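The split-and-delay idea can be illustrated with a purely behavioral sketch: each copy of the bit stream is delayed by a whole number of bit periods, and a single sub-rate clock samples all copies at the same instants. The model below is idealized (no loss, noise, or timing error), and the function name is hypothetical.

```python
def split_and_delay_demux(bits, n_outputs=4):
    """Demultiplex `bits` using delayed copies and one sub-rate clock.

    Copy k is delayed by k bit periods, so at bit-time t it carries
    bits[t - k]. A single clock samples every `n_outputs` bit periods,
    starting once all copies carry valid data.
    """
    sample_times = range(n_outputs - 1, len(bits), n_outputs)
    return [[bits[t - k] for t in sample_times] for k in range(n_outputs)]


bits = [1, 0, 1, 1, 0, 0, 1, 0]
streams = split_and_delay_demux(bits, n_outputs=4)

for k, stream in enumerate(streams):
    print(f"output with {k}-bit delay: {stream}")
```

Each output carries every fourth bit of the input, with the most delayed copy holding the earliest bit of each group, so the four outputs together recover the full sequence from one clock phase.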

#### 2.3.3 Front-end bandwidth

Conventionally, the combined bandwidth of the TIA and the LA stages is designed to be $0.7 \times$ the data rate. This value is chosen to be high enough to avoid intersymbol interference (ISI) and low enough to avoid excessive noise within the bandwidth of operation. However, the higher bandwidth needed to support higher data rates results in higher power consumption. Moreover, TIAs do not scale well with the technology node. Thus, there has been a recent



Fig. 2.17: Sampling using the silicon photonics structure with four quarter-rate outputs (left) and two half-rate outputs (right).

interest in developing so-called low-bandwidth optical receivers that attempt to reduce the bandwidth of the input stage by using techniques such as equalization and an integrating front-end with a reset signal. An example of this type of receiver is presented in Chapter 5 with a complete discussion of low-bandwidth receivers in the literature.

## 2.4 Optical split and delay structure

The passive optical structure, shown in Fig. 2.18 for a demultiplex-by-four receiver, consists of an optical splitter that divides the signal into four, followed by optical delay lines, each used to delay the signal by one bit relative to the previous output, with the first output having no delay. Each optical output is fed to a photodetector for detection.

The coupling ratio of the directional couplers is adjusted such that the output power is the same for each of the four outputs considering the insertion loss of the delay lines. Four variants of the delay lines were designed at 10 Gb/s and 20 Gb/s. The two variants

at 10 Gb/s have lengths of 7.2 mm and 8.2 mm for cross-sections of 220 nm $\times$ 500 nm and 220 nm $\times$ 3 $\mu$m with 3.2 dB and 0.3 dB insertion loss, respectively. The two variants at 20 Gb/s have lengths of 3.6 mm and 4.1 mm for cross-sections of 220 nm $\times$ 500 nm and 220 nm $\times$ 3 $\mu$m with 1.5 dB and 0.1 dB insertion loss, respectively [18].
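The idea of choosing coupling ratios that equalize the output powers despite delay line loss can be sketched numerically. The sketch assumes a serial tap topology, in which each coupler taps a fraction of the power to an output and passes the rest through a one-bit delay line to the next coupler. This topology, the function names, and the use of the 3.2 dB loss figure per delay line are assumptions made for illustration and may differ from the actual structure in [18].

```python
def equalizing_tap_ratios(n_outputs: int, delay_loss_db: float):
    """Tap ratios that give equal power at every output.

    Assumed model: stage k taps a fraction r_k of its input power to
    output k; the remainder passes through a delay line with power
    transmission a before reaching stage k + 1. The last stage takes
    all remaining power (r = 1). Equal outputs require
    r_k = a * r_{k+1} / (1 + a * r_{k+1}).
    """
    a = 10 ** (-delay_loss_db / 10)
    ratios = [1.0]
    for _ in range(n_outputs - 1):
        nxt = ratios[0]
        ratios.insert(0, a * nxt / (1 + a * nxt))
    return ratios


def output_powers(ratios, delay_loss_db: float):
    """Power at each output for unit input power under the same model."""
    a = 10 ** (-delay_loss_db / 10)
    remaining, outputs = 1.0, []
    for r in ratios:
        outputs.append(remaining * r)
        remaining *= (1 - r) * a
    return outputs


ratios = equalizing_tap_ratios(4, delay_loss_db=3.2)
powers = output_powers(ratios, delay_loss_db=3.2)
print("tap ratios:   ", [round(r, 3) for r in ratios])
print("output powers:", [round(p, 3) for p in powers])
```

With lossless delay lines the ratios reduce to 1/4, 1/3, 1/2, and 1, which is the familiar equal-split case; loss in the delay lines shifts power toward the later outputs to compensate.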



Fig. 2.18: Optical split-delay structure.

The optical delay lines used are passive and only suffer from attenuation and timing errors. Specifically, the timing error is approximately 3 ps according to [18]. They also suffer from insertion loss as described above. Curves describing the insertion loss are also provided in [18]. Moreover, optical delay lines are stable against temperature variations: the shift is only 0.6 ps for a 100 °C temperature change.

In terms of linearity, higher optical power leads to change in the refractive index. This is indicated by the following equation:

$$n = n_0 + n_2 \times I = n_0 + n_2 \times \frac{P}{\pi \omega^2} \tag{2.4}$$

where $n_0$ is the linear refractive index, $n_2$ is a constant related to the third-order nonlinear susceptibility that depends on the material, $I$ is the intensity of the optical signal, $P$ is the optical power, and $\omega$ is the radius of the mode. In nonlinear systems, the typical value of $I$ is $10^{10}$ W/m$^2$. However, the maximum power used in the work presented in this thesis is 0 dBm (1 mW) with a typical width of 500 nm, corresponding to $I \approx 0.1 \times 10^{10}$ W/m$^2$. Thus, the optical systems operate in the linear regime.
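The intensity estimate can be checked with a short calculation of $I = P / (\pi \omega^2)$. The sketch assumes that the quoted 500 nm is used as the mode radius $\omega$, which reproduces the order of magnitude stated in the text; the function name is hypothetical.

```python
import math


def intensity_w_per_m2(power_w: float, mode_radius_m: float) -> float:
    """Optical intensity I = P / (pi * w^2) for a mode of radius w."""
    return power_w / (math.pi * mode_radius_m ** 2)


# 0 dBm = 1 mW; mode radius assumed to be 500 nm.
intensity = intensity_w_per_m2(1e-3, 500e-9)
typical_nonlinear = 1e10  # W/m^2, typical value quoted for nonlinear systems

print(f"I = {intensity:.2e} W/m^2")
print(f"ratio to typical nonlinear intensity = {intensity / typical_nonlinear:.2f}")
```

Under this assumption the intensity is roughly an order of magnitude below the quoted typical nonlinear value.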

Different variations of this structure are used in all receivers presented in this thesis.

# 2.5 Summary

This chapter presented background information that is used in the subsequent chapters of this thesis. This includes an overview of silicon photonics, a survey of optical and electronic delay lines, an overview of optical receivers with demultiplexed output, and the silicon photonic split-and-delay structure that could be used to replace clock phase generation in the optical receiver.

# Chapter 3

# An Optical Receiver Exploiting SiP Delays for Clock Phases Replacement

This chapter presents the first optical receiver developed. The receiver uses the silicon photonic split and delay structure described in [18] and in the background chapter to replace clock generation circuits. The work presented in this chapter has been published as a journal paper in the IEEE Transactions on Circuits and Systems II [5].

## 3.1 Introduction

In conventional optical interconnect systems, the receiver generates multiple clock phases to sample the incoming data when multiple bits are sent within each cycle of the receiver clock (Fig. 3.1a) and outputs demultiplexed streams, accordingly. Quarter-rate clocking


employing four clock phases is an attractive approach because it has the widest time margin and is more power efficient compared to full-rate and half-rate clocking schemes [31]. While there are several ways to generate this multi-phase clock, such as LC or ring injection locked oscillators, this is becoming increasingly challenging. For example, the transmitter in [32] uses an LC phase locked loop (LC-PLL) and a quadrature generator to generate four clock phases followed by buffers for each clock phase and then a per-lane duty-cycle detection/correction (DCD/DCC) and quadrature-error detection/correction (QED/QEC) circuit to ensure proper duty cycle and spacing for all clock phases. In fact, one of the main challenges is designing quadrature phase detection/correction circuits [32]. These circuits can be removed if only a single clock phase is needed by utilizing passive delay lines instead, thus simplifying the design of the receiver. Moreover, these extra circuits consume power, whereas passive delay lines do not. In [33], the multiphase LC-ring structure that could be used to generate multiple clock phases would consume 8 mW, a non-trivial amount of power. Furthermore, 30 to 45 % of a receiver's power consumption is attributed to clock generation and/or clock buffering [34–36]. In these energy-efficient optical receivers, four or two clock phases are used for clocking the comparators and for demultiplexing the output at quarter or half the data rate. These clock phases are either generated off-chip and buffered on-chip or generated and buffered on-chip. Eliminating or reducing the clock buffering in these systems could result in more energy-efficient solutions overall.




**Fig. 3.1:** (a) Sampling in conventional optoelectronic receiver with four clock phases. (b) Sampling in the proposed system using four different delays.

In this chapter, a receiver is designed to exploit passive optical delay lines to eliminate the electronic clock generation and buffering circuits through the sampling scheme shown in Fig. 3.1b. The chapter is organized as follows: Section 3.2 presents the overall architecture of the receiver and the circuit implementation. Section 3.3 discusses interstage AC coupling and printed circuit board (PCB) parasitics. Section 3.4 reports experimental results of the receiver fabricated in CMOS 65 nm. Section 3.5 discusses the results. Finally, Section 3.6 summarizes the chapter.

# 3.2 System Architecture

The receiver is designed to take advantage of a validated silicon photonic (SiP) split-delay structure shown in Fig. 3.2 and detailed in [18]. The passive optical structure splits the

incoming bit stream into four substreams. Each substream is sequentially shifted in time by one bit relative to the adjacent substream. The optical output power of each substream is approximately one-quarter of the power of the optical input data stream. This SiP structure enables each sub-receiver in the integrated circuit (IC) to recover the bits without needing the clock generation circuits found in quarter-rate receivers.



**Fig. 3.2:** The SiP chip is outlined in red while the IC is outlined in blue.

The SiP split-delay structure consists of three passive delay lines and four directional couplers. The length of each delay line corresponds to a one-bit delay. Each delay line occupies an area of $255 \,\mu\text{m} \times 255 \,\mu\text{m}$, making its integration with the IC chip possible. This area shrinks at higher data rates because the one-bit delay line becomes shorter as the bit period decreases. The coupling ratio of each of the four directional couplers is chosen such that the optical output power at the four outputs is the same by considering the delay line propagation loss and the splitting losses. Each of the four optical signals is

detected by a photodetector that would be connected to the input of a sub-receiver on the receiver IC.

The electronic section of the receiver, consisting of four identically designed sub-receivers, is fabricated in a CMOS 65 nm process. Fig. 3.3 shows a detailed circuit implementation of the sub-receiver. Each sub-receiver comprises a common-gate transimpedance amplifier (TIA) that converts the photocurrent into a voltage and provides a low input impedance for the photocurrent. This stage is biased with an externally fed current flowing into a current mirror. To ensure the stability of the TIA (determined by its input pole) against parasitic variations at the input due to the connection to a PCB, a simple common-gate topology without feedback is designed. Following the TIA is a two-stage cascode amplifier with inductive peaking that amplifies the voltage output of the TIA to a level sufficient to drive the next stage, which consists of a latch. Inductive peaking is employed in both of the voltage gain stages to enhance their bandwidth. AC coupling with a small capacitor ($10 \, \mu m \times 10 \, \mu m$) of 80 fF is used to avoid additional parasitics at the input of the voltage gain stages. The latch is implemented as a single high-speed current mode logic (CML) latch that compares the amplified voltage signal to a DC reference voltage signal, labeled as Ref in Fig. 3.3. The reference voltages can be adjusted externally to account for offset in the latch and any variations in the optical delay lines, the responsivity of the photodetectors, and the CMOS process. A CML latch is used to avoid the kickback noise found in CMOS latches. CML latches also benefit from reduced voltage supply and a good common-mode rejection ratio.

The return-to-zero (RZ) output voltage levels are converted to CMOS compatible levels using a low-speed pseudo NMOS inverter, and finally to non-return-to-zero (NRZ) using a CMOS D-FF, precluding the need for a slave CML latch and reducing power consumption.



Fig. 3.3: Detailed circuit implementation of one of the four sub-receivers.

Finally, the sub-receiver output is buffered to drive the measuring equipment. Due to the SiP split-delay structure, each sub-receiver receives a delayed version of the input enabling all four incoming bits to be processed simultaneously using a single clock operating at a quarter rate of the incoming data rate.

Fig. 3.4 shows transient simulations of the sampling of the amplified incoming data bits with a quarter-rate clock, together with the output of the comparator. This simulation shows that a single clock can be used to generate a quarter-rate output.

Moreover, Fig. 3.5 shows the AC response of the analog front-end. The bandwidth



Fig. 3.4: Sampling of amplified bits to generate quarter rate output.

is approximately 15 GHz meaning that this receiver is expected to achieve a speed of 25 Gb/s. However, as described later in the chapter, packaging parasitics limit the speed to 12.5 Gb/s.

#### 3.3 PCB parasitics and AC coupling noise analysis

The die (Fig. 3.6) in a QFN-80 package is mounted on a high-speed low-loss RO4350B PCB for measurements. Additional parasitics associated with the PCB impact the performance of the chip. These parasitics (Fig. 3.7) come from the transmission line on the PCB (2 cm in this case), the package capacitance (150 fF), and the 3 mm bond wire inductance



Fig. 3.5: AC simulations with an ideal photodetector.

( $\sim 3$  nH). This is compared with the conventional case where the photodetector is wire bonded directly with the IC chip with a 1 mm bond wire ( $\sim 1$  nH) by simulating the gain and the bandwidth for the first three stages for both cases.

The post layout bandwidth and gain simulations shown in Fig. 3.8 indicate a transimpedance gain of 55 dB $\Omega$  and a 3 dB bandwidth of 23 GHz with the input parasitics of a directly wirebonded photodetector. With the added input parasitics of the PCB, frequency domain ripple occurs due to the interaction between the long transmission line, the input parasitic capacitance of the package, and the wire bonding inductance. In an integrated system including SiP delay lines and an IC chip, a higher speed of operation can be expected, closer to the directly wire bonded case without PCB parasitics.



**Fig. 3.6:** A micrograph of the IC chip. The IC is wire bonded to a QFN-80 package (not shown) mounted on a high-speed low-loss RO4350B PCB.

The input impedance of the chip including the PCB parasitics looking from the transmission line end is:

$$Z_{in} = \frac{R_L - \omega^2 L C_L R_L + j\omega L}{-j\omega^3 C_{pack} L C_L R_L - \omega^2 C_{pack} L + j\omega R_L (C_{pack} + C_L) + 1} \tag{3.1}$$

where L = 3 nH is the parasitic inductance,  $C_{pack} = 150 \ fF$  is the package parasitic capacitance,  $C_L = 80 \ fF$  is the parasitic capacitance of the IC, mostly from the bond pad and TIA input, and  $R_L = 60 \ \Omega$  is the simulated input resistance of the TIA.

The accepted power at the receiver versus frequency (given by $1 - |\Gamma|^2$, where $\Gamma$ is the reflection coefficient) is plotted in Fig. 3.9. At low frequencies, almost all the power is delivered to the IC. However, as the frequency increases with higher transmission speeds,



**Fig. 3.7:** Model of the PCB interconnect and package between the PD and the input of each sub receiver.

the accepted power decreases due to the mismatch between the IC chip and the transmission line, limiting the bandwidth of the system at the input. In the time domain, this mismatch results in reflections that cause the eye-opening to degrade at higher speeds. Thus, ideally, bond wire lengths should be shorter than 1 mm for optimal operation and to avoid issues such as ringing at high-speed or reflections at the input of the receiver.
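The frequency dependence of the accepted power can be reproduced qualitatively from equation 3.1 and the quoted component values. The sketch below assumes a 50 $\Omega$ transmission line impedance, which is not stated in the text, and uses hypothetical function names.

```python
import math


def z_in(freq_hz, l_bond=3e-9, c_pack=150e-15, c_l=80e-15, r_l=60.0):
    """Input impedance of equation 3.1 seen from the transmission line end."""
    w = 2 * math.pi * freq_hz
    num = complex(r_l - w**2 * l_bond * c_l * r_l, w * l_bond)
    den = complex(
        1 - w**2 * c_pack * l_bond,
        w * r_l * (c_pack + c_l) - w**3 * c_pack * l_bond * c_l * r_l,
    )
    return num / den


def accepted_power_fraction(freq_hz, z0=50.0, **parasitics):
    """Fraction of incident power accepted by the load: 1 - |Gamma|^2.

    z0 is the assumed characteristic impedance of the PCB line.
    """
    z = z_in(freq_hz, **parasitics)
    gamma = (z - z0) / (z + z0)
    return 1 - abs(gamma) ** 2


for f_ghz in (0.0, 1.0, 5.0, 10.0):
    frac = accepted_power_fraction(f_ghz * 1e9)
    print(f"{f_ghz:5.1f} GHz: accepted fraction = {frac:.3f}")
```

Passing a smaller `l_bond` (for example 1 nH for a 1 mm bond wire) allows the same function to compare the PCB-mounted case with a directly wire-bonded photodetector.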

AC coupling is utilized here to allow for more convenient biasing of the receiver. From Fig. 3.3, the transimpedance gain of the first three stages including AC coupling is given by:

$$\frac{V_{out}}{I_{in}} = R_S \cdot \left(\frac{sC_C R_L}{sC_C R_L + 1} \cdot gm_{m4} \cdot r_{o4} \cdot gm_{m5} \cdot (R_V + sL)\right)^2 \tag{3.2}$$

where  $gm_{m4}$  and  $gm_{m5}$ ,  $r_{o4}$  and  $r_{o5}$  are the transconductances and output resistances



**Fig. 3.8:** AC simulations comparing the performance of the IC chip when wire bonded to a PD compared to the IC mounted on a PCB.

of transistors m4 and m5, respectively. At frequencies above the low cut-off frequency of the high-pass filters (where $|sC_C R_L| \gg 1$), the gain becomes similar to that of a DC coupled system. Since the cascode stages are AC coupled with a low cut-off frequency of 0.5 MHz formed by the 3.8 M$\Omega$ resistor and the 80 fF capacitor, a pseudo-random bit sequence (PRBS 7) is selected for the measurements.
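The quoted cut-off can be reproduced with the first-order high-pass relation $f_c = 1/(2\pi RC)$. The sketch below also compares it with the spacing of the spectral lines of a PRBS pattern (bit rate divided by pattern length), which is one plausible reading of why a short pattern suits the AC-coupled path. The function names are hypothetical, and the use of the full 12.5 Gb/s line rate for the comparison is an assumption.

```python
import math

R_HIGHPASS = 3.8e6    # resistor forming the high-pass filter, ohm
C_COUPLING = 80e-15   # coupling capacitor, F


def highpass_cutoff_hz(r_ohm, c_farad):
    """-3 dB frequency of a first-order RC high-pass filter."""
    return 1.0 / (2 * math.pi * r_ohm * c_farad)


def prbs_line_spacing_hz(order, bit_rate_hz):
    """Spacing of the discrete spectral lines of a PRBS of the given order."""
    return bit_rate_hz / (2**order - 1)


if __name__ == "__main__":
    f_c = highpass_cutoff_hz(R_HIGHPASS, C_COUPLING)
    print(f"cut-off frequency: {f_c / 1e6:.2f} MHz")
    for order in (7, 31):
        spacing = prbs_line_spacing_hz(order, 12.5e9)  # ASSUMED line rate
        print(f"PRBS {order}: lowest spectral line at {spacing:.3g} Hz")
```

With these numbers the lowest PRBS 7 line sits far above the cut-off, whereas a PRBS 31 pattern would contain energy well below it.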

The noise performance due to AC coupling is dominated by  $R_S$  and  $R_L$ , and the coupling capacitor  $C_C$ . The total voltage noise power due to  $R_L$  is approximated to be:

$$\overline{v_{n_{RL}}^2} \approx 4kT \cdot \frac{R_S^2}{R_L^2} \cdot \frac{1}{2\pi C_C} \tan^{-1} (2\pi f C_C R_L)$$
(3.3)

As can be seen in (3.3), since $R_L \gg R_S$, the noise is small within the bandwidth of the system. It can also be shown that for $R_L \gg R_S$, the voltage noise power spectral density



**Fig. 3.9:** The fraction of accepted power versus frequency due to parasitics at the input.

due to $R_S$ is approximately:

$$\frac{\overline{v_{n_{RS}}^2}}{\Delta f} \approx \frac{4kT}{R_S} \tag{3.4}$$

This is similar to the noise power spectral density of $R_S$ in the case of a DC coupled system. Therefore, it is concluded that AC coupling impacts neither the gain nor the noise performance of the system compared to a DC coupled system.

#### 3.4 Experimental validation

The receiver is validated through optical measurements using an off-chip photodetector module (DSC10H from manufacturer Discovery Semiconductors) with a reported typical

responsivity of 0.6 A/W at 1550 nm. The measurement setup is shown in Fig. 3.10. The continuous-wave (CW) light from the laser is connected to a polarization controller (PC) and then modulated using a 12.5 Gb/s Mach–Zehnder Modulator (MZM) with a PRBS 7 sequence generated by a programmable pattern generator (PPG). The output of the MZM is then connected to an erbium-doped fiber amplifier (EDFA) and then filtered. A variable optical attenuator (VOA) controls the optical power at the photodetector. The output of the VOA is connected to an optical delay line followed by a 10/90 coupler to monitor the optical power with a meter. The optical delay line emulates the operation of the SiP chip. The clock sampling phase is adjusted manually, but an electronically tunable optical delay [29] can be used to make this adjustment, removing the need for a clock and data recovery circuit or a PLL. For commercial high-volume production, an additional circuit to align the single clock phase with data would be needed. This circuit could be a simple delay-locked loop that is used to delay the clock phase. The error detector is used to measure the bit error rate (BER) versus average optical power, which is controlled by the VOA. The measurement is repeated for each of the four sub-receivers. The delay of the tunable delay line is increased by one bit for each subsequent sub-receiver. Cable, PCB, and bond-wire losses are de-embedded.

Fig. 3.11 shows the measured BER curves and the output eye for all sub-receivers. Each sub-receiver output is operating at a quarter of the 12.5 Gb/s data rate, i.e., 3.125 Gb/s, in this validation. The output eye diagrams show dual-rail levels due to the



**Fig. 3.10:** Test setup for the IC in the receiver using an external photodetector module. The dashed outline represents the components used to emulate the SiP chip. Dotted lines represent the change of the connection between the PD and the input of the IC for testing of each of the four channels.

impedance mismatch between the output driver, PCB transmission lines, cables, and the measurement equipment. However, the eye remains sufficiently open for the bit error tester to make an accurate decision. The system achieves a BER of  $10^{-12}$  for an input power of approximately -4 dBm for sub-receivers Rx1, Rx2, and Rx3. Sub-receiver Rx4 exhibits poor BER performance due to a longer connection on the IC between the bond pad and the input of the TIA. This connection is outlined in Fig. 3.6 and stems from the limited

chip area available. This connection adds parasitic capacitance to ground, and series resistance and inductance that degrade performance significantly as indicated by the corresponding BER curve. An investigation through simulations with these parasitics corroborates this hypothesis. Performance consistent with sub-receivers Rx1, Rx2 and Rx3 can be expected from sub-receiver Rx4 in an optimized layout. The performance variations among Rx1, Rx2, and Rx3 are due to asymmetries between the global pad connections to the different sub-receivers, and to process, voltage and temperature variations.



**Fig. 3.11:** The measured BER curve for each of the sub-receivers, with the four output eye diagrams at a data rate of 3.125 Gb/s. Output eye diagrams are shown at a BER of $10^{-12}$ for Rx1-Rx3 and approximately $10^{-6}$ for Rx4.

Variations in the optical delay lines can occur due to the SiP fabrication process. In [18],

the measured delay variations are up to 3 ps and up to 7 ps for the 20 Gb/s and 10 Gb/s SiP split and delay structures, respectively. To test the robustness of the electronic receiver to such variations, BER measurements with respect to a phase offset between the data and the clock are performed (Fig. 3.12). The receiver shows a tolerance of approximately 0.2 UI (i.e., 16 ps) to a phase offset at 12.5 Gb/s at a BER of 10<sup>-12</sup>. The receiver is thus robust to possible variations in the optical delay lines. Multiple channels of the receiver were also concurrently tested to verify that the data is well-recovered sequentially. No performance degradation was caused by inter-channel crosstalk, as there was sufficient spacing between the four sub-receivers in the chip layout.
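The conversion from unit intervals to picoseconds behind the 16 ps figure is sketched below, together with a comparison against the delay variations reported in [18]. The helper names are hypothetical, and treating the reported variations as a direct phase offset is a simplifying assumption.

```python
def unit_interval_ps(bit_rate_gbps):
    """Duration of one unit interval (one bit period) in picoseconds."""
    return 1e3 / bit_rate_gbps


def tolerance_ps(tolerance_ui, bit_rate_gbps):
    """Phase-offset tolerance converted from unit intervals to picoseconds."""
    return tolerance_ui * unit_interval_ps(bit_rate_gbps)


if __name__ == "__main__":
    margin = tolerance_ps(0.2, 12.5)
    print(f"UI at 12.5 Gb/s: {unit_interval_ps(12.5):.1f} ps")
    print(f"0.2 UI tolerance: {margin:.1f} ps")
    # Delay variations reported in [18] for the 20 Gb/s and 10 Gb/s structures
    for variation_ps in (3.0, 7.0):
        print(f"{variation_ps:.0f} ps variation uses {variation_ps / margin:.0%} of the tolerance")
```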



Fig. 3.12: Bathtub curve measurements of the receiver for a 12.5 Gb/s input.

The power consumption breakdown of the IC is shown in Fig. 3.13. The single-phase

clock buffering represents only 9.9 % (i.e., 6.36 mW) of the total power consumption. The total power consumption for all channels excluding the output buffers is 24.44 mW. The resulting energy efficiency at 12.5 Gb/s is 1.93 pJ/bit including clock buffering but excluding the output buffers. This energy efficiency can be further improved by increasing the data rate of the receiver through wire bonding a 4-channel photodetector array to the input of the receiver instead of the PCB connection, which is speed limited by input parasitics. Moreover, it is possible to use an integrating low-bandwidth front-end such as the one proposed in [18] that consumes less power leading to better energy efficiency. The receiver can also operate at a voltage supply lower than 1.0 V, down to 0.82 V. At 0.82 V the energy efficiency is 1.8 pJ/bit.



Fig. 3.13: Power consumption breakdown.

#### 3.5 Discussion

The proposed technique exhibits two trade-offs. First, the splitting of the signal and the delay lines degrades the sensitivity of the receiver by approximately 7 dB due to the splitting and propagation losses. This can be compensated by utilizing forward error correction (FEC) codes [37]. For example, the RS(255,239) super FEC coding scheme provides 7.95 dB of net coding gain improving the BER from  $5.8 \times 10^{-3}$  (threshold) to  $10^{-12}$ . Second, the receiver needs to operate at a speed set by the optical delay lines. Tunable silicon photonics delay lines (e.g. [29]) that can have a continuous delay range of up to 1 ns can be used to replace the fixed delay lines in the present SiP structure allowing for variable data rates. Since the impact of clocking on the overall power consumption is continuing to increase linearly as the operating speed increases for a given technology node, the concept presented here provides a compelling advantage in power savings related to clocking.

Table 3.1 shows a summary of the performance along with other state-of-the-art receivers from the literature. As a result of utilizing optical delay lines, only one clock phase needs to be applied externally, and the second phase is generated using a minimum size inverter (for latch operation). Accordingly, the power consumption of the buffer of the external clock phase is only approximately 9.9 % of the receiver power consumption, which compares favorably to other works (in similar technology nodes) that exhibit values beyond 29 %. As such, the measured 6.36 mW power consumption of the clock generation and buffering block is lower than that reported in [34], which consumes 18 mW for the distribution of four clock phases.

The receiver is also more efficient than [38], which uses a passive poly-phase filter to generate a 4-phase clock followed by a phase-error corrector and a phase interpolator. While [35, 36] achieve better power consumption for the clocking blocks, these receivers do not generate clock phases on-chip and the necessary clock phases are provided externally to the chip. The proposed system benefits from a wide time margin and the energy efficiency of a quarter-rate clocking system, while not requiring additional circuits to correct for duty cycle and quadrature errors.

It should be noted that optical delay lines are only used to replace clock phase generation circuits, and the power saving claimed in this prototype only includes the removal of clock phase generation circuits. In a complete system where a phase interpolator/rotator is used to align the clock phase with data, the optical delay lines only serve to simplify the clock phase generation blocks. This is because using a phase shifter/interpolator implies clock phase generation. This leads to a lower power saving than what is claimed in this implementation. This prototype demonstrates that optical delay lines can be used to simplify or eliminate clock phase generation circuits specifically and that there are potential power savings in doing so.

#### 3.6 Conclusion

This chapter presented a 12.5 Gb/s optoelectronic receiver in 65 nm CMOS that employs SiP delay lines to eliminate clock generation circuits and the associated buffers. The receiver

Table 3.1: Performance summary and comparison

| | This work | [34] | [35]<sup>2</sup> | [36]<sup>3</sup> | [38] |
|---|---|---|---|---|---|
| Technology (nm) | 65 | 40 | 65 | 65 | 65 |
| Data rate (Gb/s) | 12.5 | 25 | 24 | 20 | 21.2 |
| Energy efficiency (pJ/bit) | 1.93 (1.8 @ 0.8 V) | 1.72<sup>4</sup> | 0.4 | 0.7 | 5.5<sup>4</sup> |
| Sensitivity (dBm) | -4.0<sup>1</sup> | -8.7 | -4.7 | -5.8 | -7.5 |
| Clocking power (mW) (% of total) | 6.4 (9.8 %) | 18 (42 %)<sup>5</sup> | 4.3 (45 %) | 6.6 (46 %) | - |
| Data pattern | PRBS 7 | PRBS 31 | PRBS 7, 9, 15 | PRBS 7 | PRBS 7, 31 |

<sup>1</sup>Does not include losses of the SiP structure. A 7 dB insertion loss is expected.

was validated experimentally through electronic and optical testing. The receiver achieves a sensitivity of -4 dBm at a BER of  $10^{-12}$  while exhibiting an energy efficiency of 1.93 pJ/bit.

A large part of the power consumption of conventional receivers is due to clock generation and clock buffering. This technique has the potential of improving energy efficiency and removing complex circuits in quarter-rate systems.

<sup>2</sup>Estimate based on the reported power consumption breakdown. Does not include clock generation and SR latch.

<sup>3</sup>Two clock phases. Clocks generated using an off-chip directional coupler.

<sup>4</sup>Energy efficiency of the receiver including a clock and data recovery block.

<sup>5</sup>Calculated based on the power consumption of the receiver and the clock distribution network.

### Chapter 4

# A 17 Gbps 156 fJ/bit Two-Channel Optical Receiver in 65 nm CMOS

This chapter presents a novel energy-efficient 17 Gbps two-channel optical receiver architecture. The two-channel architecture improves upon the receiver presented in the previous chapter as it reduces insertion loss and improves energy efficiency. The work presented in this chapter has been published as a journal paper in the IEEE Transactions on Circuits and Systems I [6].

#### 4.1 Two-channel optical receivers overview

#### 4.1.1 Two-phase clocked receiver

Fig. 4.1a illustrates a conventional two-channel receiver architecture. Here the optical input is applied to a photodetector (PD) and the resulting photocurrent is then passed to a TIA. The output of the TIA is then split into channel 1 and channel 2. To digitize the TIA output signal, this signal is compared with a reference signal at  $f_S/2$  clock speed in each channel. The final output results from serializing the outputs of the two channels. As shown, this architecture requires two clock phases at  $f_S/2$ : the clock and the clock signal shifted in phase by  $180^{\circ}$  (labelled Clock  $180^{\circ}$  in Fig. 4.1a and Fig. 4.1b). This architecture exhibits crosstalk between the paths resulting in inter-symbol interference, as well as the clock feedthrough from one channel clock on the other channel signal at the output of the TIA. Thus, before splitting the signal into two paths, the TIA is usually followed by gain stages to improve the signal-to-noise ratio of the receiver.

To mitigate the crosstalk and clock-feedthrough noise, Fig. 4.1b shows an architecture that splits the input signal earlier in the signal paths, i.e., before the PD. The TIA gain requirement can be relaxed, but two PDs and two TIAs are needed. This architecture also requires two clock phases at $f_S/2$: the Clock and $Clock\_180^{\circ}$ phases. Here, the optical input is divided into two identical optical paths and then applied to two PDs. Consequently, this architecture requires at least 3 dB more optical input power to compensate for the optical

splitter. For instance, if four paths are used, as in the previous chapter, then 4-phase clocking must be adopted. One of the trade-offs in that case is that the optical input must be split into four paths, resulting in a theoretical 6 dB optical insertion loss.
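The 3 dB and 6 dB figures follow from an ideal, lossless 1:N power split, where each output carries 1/N of the input power. A minimal sketch, with a hypothetical function name and ignoring excess and propagation losses:

```python
import math


def ideal_split_loss_db(n_paths):
    """Per-path insertion loss of an ideal, lossless 1:N optical power splitter."""
    return 10 * math.log10(n_paths)


if __name__ == "__main__":
    for n in (2, 4):
        print(f"1:{n} split: {ideal_split_loss_db(n):.2f} dB per path")
```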

#### 4.1.2 Two channel optical split and delay receiver

In this two-channel optical receiver, instead of delaying the phase of the main clock by  $180^{\circ}$  and passing it to the second channel, its optical input can be delayed. This concept simplifies the receiver by removing the need for the  $Clock\_180^{\circ}$  phase. As illustrated in Fig. 4.1c, the input can be passively delayed in its optical form, before converting it to an electrical signal in the PD. To implement this concept, the optical signal should be split into two optical signals. One signal is directly passed to the PD of channel 1, and the second signal is delayed by one bit period  $(T_D)$  and passed to the PD of channel 2. The development of silicon photonics (SiP), which enables the fabrication of optical circuits with the mass production tools developed for CMOS circuits, makes the implementation of simple processing functions in the optical domain straightforward and economically viable [18,39]. The main drawback of this concept is that the optical power received by each PD is at least 3 dB less than the total power at the input of the receiver, which will reduce the sensitivity of the optical receiver and can limit its reach.
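The split-and-delay idea can be illustrated with an idealized, noise-free bit-level model: every channel is sampled on the same clock edge, and channel `ch` sees the input delayed by `ch` bit periods, so a single clock phase recovers interleaved bits. The function names and the list-based representation are assumptions made for illustration only; the model ignores bandwidth, noise, and timing errors.

```python
def split_and_delay_demux(bits, n_channels=2):
    """Sample n_channels delayed copies of `bits` with one shared clock.

    The clock runs at 1/n_channels of the bit rate. Channel 0 is undelayed;
    channel ch is delayed by ch bit periods, so at each clock edge it holds
    the bit that arrived ch periods before the newest one.
    """
    outputs = [[] for _ in range(n_channels)]
    n_edges = len(bits) // n_channels
    for k in range(n_edges):
        newest = k * n_channels + (n_channels - 1)  # index of the newest bit at this edge
        for ch in range(n_channels):
            outputs[ch].append(bits[newest - ch])
    return outputs


def serialize(outputs):
    """Re-interleave the channel outputs, oldest bit (most delayed channel) first."""
    serial = []
    for samples in zip(*outputs):
        serial.extend(reversed(samples))
    return serial


if __name__ == "__main__":
    data = [1, 0, 1, 1, 0, 0, 1, 0]
    channels = split_and_delay_demux(data, n_channels=2)
    print("channel 1 (direct): ", channels[0])
    print("channel 2 (delayed):", channels[1])
    print("recovered:", serialize(channels))
```

Calling the same function with `n_channels=4` models the four-path structure of the previous chapter.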

The proposed architecture has the following advantages over the structure presented in Fig. 4.1b. It only requires one clock phase to sample the signal in the comparator of both



**Fig. 4.1:** (a) A conventional two-channel receiver architecture that splits the paths after the TIA, and requires the Clock and $Clock\_180^{\circ}$ (i.e., the clock signal that is shifted by $180^{\circ}$) phases; (b) a two-channel receiver architecture that splits the paths before the PD, and requires the Clock and $Clock\_180^{\circ}$ phases; (c) the proposed two-channel receiver architecture that splits the paths before the PD and only requires one clock phase for its comparators.

channels, such that $Clock\_180^{\circ}$ is not required. It should be noted that $Clock\_180^{\circ}$ is usually available in a receiver since it can be generated by inverting the main clock to generate $\overline{Clock}$. However, generating a $\overline{Clock}$ signal that is exactly $180^{\circ}$ phase shifted from the clock signal requires additional circuitry. Indeed, this additional circuitry is needed to adjust the phase difference between Clock and $\overline{Clock}$ to be exactly $180^{\circ}$ and ensure that the duty cycles of both Clock and $\overline{Clock}$ are precisely 50 %. The architecture proposed in Fig. 4.1c does not require such an accurate clock, as will be explained in the next section. Another advantage of this structure is that the duty cycle of the clock can be tuned to improve the comparator performance, as validated by the measurement results presented in the next section.

The area and cost overhead of the additional optical elements required by this architecture is minimal, especially when implemented with silicon photonics. The split-delay structure, shown in Fig. 4.2, can be built using a directional coupler followed by a delay line. The coupling ratio of the directional coupler can be adjusted to compensate for the optical propagation loss in the delay line such that the power at each PD is the same. The delay line loss for a silicon-on-insulator optical waveguide with a cross-section of 220 nm $\times$ 3 $\mu$m ranges between 0.1 and 0.2 dB/cm [18]. As a result, the coupling ratio needed is r = 49/51. The benefit of carefully tuning the coupling ratio is that it eliminates the need for gain control stages in the TIA to compensate for unequal optical power at the photodetectors. As such, the two TIA stages in each sub-receiver can be identical. Another benefit of the integration of photonic elements is their compact size and low cost since this technology leverages the

infrastructure of existing CMOS foundries. For example, a 50 ps (20 Gb/s) delay line along with the directional couplers and the photodiodes occupies only $0.43\ \text{mm}^2$ [18] on the SiP die. The cost of this process per fabrication area is below that of modern CMOS processes, since the latter require several masks with small critical dimensions. Note that for higher transmission speeds, the cost to fabricate the SiP chips is even lower since the delay lines needed are shorter [20]. The delay line length can be finely controlled to achieve accurate delays. In [18], the delay offset between the fabricated devices and the design value is approximately 3 ps at 20 Gb/s. It is also possible to use electronically tunable optical delay lines such as [29, 40] that can provide tunable delays of up to 1 ns. The temperature dependency of the delay in the silicon delay lines is only 0.01 % per degree Celsius [41]. At a 17 Gbps input with a required delay of 59 ps, the change in delay due to a temperature shift of $100\,^{\circ}\text{C}$ is only 0.6 ps. The devices presented in [18] were designed for 20 Gb/s links and not for 17 Gbps. Consequently, a discrete optical splitter and a mechanically tunable optical delay line were used instead in this receiver.
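The delay, thermal-drift, and coupling-ratio figures above can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the group index of 3.7 is an assumed value for the waveguide (it is not given in the text), the function names are hypothetical, and the coupler is treated as lossless.

```python
C0 = 299_792_458.0   # speed of light in vacuum, m/s
GROUP_INDEX = 3.7    # ASSUMED group index of the SOI waveguide (not from the text)


def bit_period_ps(bit_rate_gbps):
    """One bit period, i.e. the required optical delay T_D, in picoseconds."""
    return 1e3 / bit_rate_gbps


def thermal_drift_ps(delay_ps, delta_t_c, coeff_per_c=1e-4):
    """Delay change for a temperature shift, using the 0.01 % per degree Celsius figure."""
    return delay_ps * coeff_per_c * delta_t_c


def delay_line_length_cm(delay_ps, group_index=GROUP_INDEX):
    """Waveguide length that produces the requested group delay."""
    return C0 * delay_ps * 1e-12 / group_index * 100


def balanced_coupling_fraction(loss_db):
    """Fraction of the input power sent to the undelayed arm so that both PDs
    receive equal power when the delayed arm has `loss_db` of propagation loss."""
    a = 10 ** (-loss_db / 10)   # power transmission of the delay line
    return a / (1 + a)


if __name__ == "__main__":
    print(f"bit period at 17 Gb/s: {bit_period_ps(17):.1f} ps")
    print(f"drift over 100 C on a 59 ps delay: {thermal_drift_ps(59, 100):.2f} ps")
    length = delay_line_length_cm(59)
    loss = 0.2 * length   # worst-case 0.2 dB/cm from the text
    frac = balanced_coupling_fraction(loss)
    print(f"length = {length:.2f} cm, loss = {loss:.3f} dB, direct-arm fraction = {frac:.3f}")
```

Under these assumptions the balanced split lands slightly below 50 % for the undelayed arm, which is in line with the near-symmetric ratio quoted above.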

#### 4.2 Design of the two-channel electronic receiver with optical-input split and delay

In this section, a two-channel optical receiver with an optical input split and delay structure prior to photodetection, shown in Fig. 4.3, is implemented. The proposed integration here



**Fig. 4.2:** Silicon photonics (SiP) split-delay structure schematic with envisioned integration with the electronic IC chip.

is a hybrid integration between the silicon photonics process and the CMOS process. This section details the electronic design of the receiver.

#### 4.2.1 Electronic receiver architecture

The receiver of each channel is connected to a PD and consists of a TIA, a comparator, and a latch, as shown in Fig. 4.3. The latch output is then passed to a current-mode buffer (output driver in Fig. 4.3) to transmit the output bits off-chip. Since the output swing of the buffer is not large enough to drive the input of the error detector (ED) of the bit-error-rate tester (BERT), an external high-bandwidth amplifier is used between the chip output and the ED input. This amplifier does not impact the performance of the chip as it is only used



Fig. 4.3: The system-level details of the implemented two-channel receiver.

to amplify the digital output of the chip.

As mentioned, the proposed architecture reduces crosstalk and clock-feedthrough by splitting the signal in the optical domain and thus relaxing the SNR requirements of the receiver. Hence, in each channel, only one TIA without additional gain stages is used to amplify the signal before the comparator, resulting in a substantial reduction of the total power consumption. Gain improvement is also proposed for the TIA to partially compensate for the lack of multiple cascaded gain stages. Furthermore, a dynamic comparator and latch are used instead of a static counterpart to significantly decrease the total power consumption.

#### 4.2.2 Transimpedance amplifier with single-ended-input and differential output

Fig. 4.4 shows the TIA and its connections to the PD and the comparator. Here, the PD is modeled as a current source with a parallel capacitance  $C_{PD}$ , representing the junction capacitance of the photodiode, and a series resistance  $R_{PD}$ . Moreover,  $L_B$ ,  $C_P$ , and  $C_L$  are the bondwire inductance, pad capacitance, and TIA load capacitance, respectively. The TIA consists of an inverter as an amplifier [42] that has a resistor in series with an inductor in its feedback. The inductor is used to improve the TIA bandwidth by introducing a zero in the TIA transfer function. Resistor  $R_{LF}$  is used to damp the high frequency peaking induced by the inductor  $L_F$ . It is also possible to reduce  $L_F$  without including  $R_{LF}$  to reduce excessive peaking. The benefit is a reduced thermal noise compared to the current implementation. In the current implementation, the difference between  $V_{OUT}$  and  $V_{IN}$  is passed to the comparator to make the bit decision. Although these two signals are not differential, they have different polarities and hence, here, they are named pseudo-differential signals. The resulting input-output transfer function of this TIA, when only  $V_{OUT}$  is used as the output, has a low-frequency gain of:

$$\left| \frac{V_{OUT}}{I_{IN}} \right| = \frac{R_F - \frac{1}{g_m}}{1 + \frac{1}{1 + g_m R_L}} \tag{4.1}$$

where

$$g_m = g_{m1} + g_{m2}, \quad R_L = r_{o1} \,\|\, r_{o2} \tag{4.2}$$

Here,  $R_F$ ,  $g_{m1}$  and  $g_{m2}$  are the feedback resistor, the transconductance of transistor M1 and the transconductance of transistor M2, in Fig. 4.4, respectively.  $r_{o1}$  and  $r_{o2}$  are the output resistances of M1 and M2.



Fig. 4.4: The TIA circuit and its connections to the PD and comparator.

The resulting input-output transfer function of this TIA, when  $V_{OUT} - V_{IN}$  is used as the output, has a low-frequency gain of  $R_F$ :

$$\left| \frac{V_{OUT} - V_{IN}}{I_{IN}} \right| = R_F \tag{4.3}$$

Thus, the pseudo-differential output shows a higher low-frequency gain. By writing the KCL equations for the circuit in Fig. 4.4, it can be shown that both transfer functions have five poles and two zeros. The two zeros of the input-output transfer function (i.e., $V_{OUT} / I_{IN}$) are:

$$\omega_{Z1a} = \frac{R_F \parallel R_{LF}}{L_F}, \quad \omega_{Z2a} = \frac{g_m}{C_L} \tag{4.4}$$

The two zeros of the input-output transfer function when $(V_{OUT} - V_{IN})$ is used as the output instead of $V_{OUT}$ (i.e., $(V_{OUT} - V_{IN})/I_{IN}$) are:

$$\omega_{Z1b} = \frac{R_F \parallel R_{LF}}{L_F}, \quad \omega_{Z2b} = \frac{g_m}{C_{GD}} \tag{4.5}$$

where  $C_{GD}$  is the sum of the gate-drain capacitance of both NMOS and PMOS transistors. The second zero of both transfer functions is at very high frequencies, and the first zero can be used to extend the bandwidth of the TIA. Assuming that the denominator of both transfer functions can be simplified as:

$$Den(s) = 1 + as + bs^{2} + cs^{3} + ds^{4} + es^{5} \tag{4.6}$$

and considering a dominant-pole transfer function, where the first pole can be approximated by 1/a, both transfer functions have the same first pole. Thus, the main advantage of using pseudo-differential output signaling is that it achieves higher gain without a detrimental effect on the frequency behavior of the TIA.

In this design, the TIA bandwidth is set to 24 GHz, which is intentionally higher than the input data rate (17 Gbps) such that the signal coming from the PD is not limited by the bandwidth of the TIA. The bandwidth was overdesigned in order to compensate for any degradation in bandwidth post-fabrication, and also in an attempt to achieve the highest speed possible. However, the speed of the receiver is limited by the speed of the comparators. This bandwidth overdesign causes more noise to be integrated within the bandwidth of the receiver and reduces the maximum possible gain. Thus, ideally, the bandwidth should be chosen to be roughly $0.7 \times$ the data rate for optimal gain and noise performance.

Fig. 4.5a shows the Bode diagram of the transfer functions whose low-frequency gains are given by (4.1) and (4.3), using the component values listed in Table 4.1. As shown, the TIA gain is improved by 1.9 dB with the pseudo-differential signaling, whereas the TIA bandwidth is reduced by only 5 %. This additional gain relaxes the need for additional signal amplification before the comparator. Fig. 4.5b shows the effect of the damping resistor $R_{LF}$ on the gain Bode diagram. If no damping resistor is used, the substantial peaking of the gain at high frequencies results in a noticeable ringing of the output pulse response of the TIA. As shown in Fig. 4.5b, this peaking is removed by using the resistor $R_{LF}$.

**Table 4.1:** Component values used in Fig. 4.4 and to plot Fig. 4.5

| Parameter | Value | Parameter | Value       | Parameter         | Value        |
|-----------|-------|-----------|-------------|-------------------|--------------|
| $C_{PD}$  | 80 fF | $C_P$     | 80 fF       | $R_F$             | 320 $\Omega$ |
| $R_{PD}$  | 80 $\Omega$ | $L_F$ | 3 nH      | $g_{m1} + g_{m2}$ | 20 mS        |
| $L_B$     | 1 nH  | $R_{LF}$  | 1 k$\Omega$ | $C_L$             | 15 fF        |
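As a rough numerical check of the low-frequency gains in (4.1) and (4.3), the following sketch evaluates both expressions with the values of Table 4.1. The output resistance $R_L$ is not listed in the table, so the 1 kΩ used here is a hypothetical value chosen only for illustration; the function names are likewise illustrative.

```python
import math

# Values taken from Table 4.1
R_F = 320.0   # feedback resistor, ohms
G_M = 20e-3   # g_m1 + g_m2, siemens

# Assumption: R_L = r_o1 || r_o2 is not given in Table 4.1.
# 1 kOhm is a hypothetical value used only for this sketch.
R_L = 1e3


def gain_single_ended(r_f, g_m, r_l):
    """Low-frequency |V_OUT / I_IN|, following the form of (4.1)."""
    return (r_f - 1.0 / g_m) / (1.0 + 1.0 / (1.0 + g_m * r_l))


def gain_pseudo_differential(r_f):
    """Low-frequency |(V_OUT - V_IN) / I_IN| from (4.3)."""
    return r_f


z_se = gain_single_ended(R_F, G_M, R_L)
z_pd = gain_pseudo_differential(R_F)
improvement_db = 20.0 * math.log10(z_pd / z_se)

print(f"single-ended gain:        {z_se:.1f} ohm")
print(f"pseudo-differential gain: {z_pd:.1f} ohm")
print(f"gain improvement:         {improvement_db:.2f} dB")
```

With this assumed $R_L$, the computed improvement lands close to the 1.9 dB reported for Fig. 4.5a; a different $R_L$ would shift the result accordingly.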



Fig. 4.5: The Bode diagram for (a) $(V_{OUT} - V_{IN})/I_{IN}$ and $V_{OUT} / I_{IN}$; (b) $(V_{OUT} - V_{IN}) / I_{IN}$ with and without the damping resistor $R_{LF}$.

#### 4.2.3 High-speed comparator with offset nulling and latch

Fig. 4.6 shows the dynamic comparator and latch. Only one of the two latches is shown for clarity. As compared to static comparators, dynamic comparators have a lower power consumption but they suffer from kickback error and feedthrough of the clock. These two effects can generate an offset at the input of the comparator as well as noise. However, the splitting at the input relaxes the kickback from the dynamic comparator to its inputs.



Fig. 4.6: The dynamic comparator and latch with offset-nulling signals  $V_{BP}$  and  $V_{BN}$ .

As mentioned, the input to the comparator is a pseudo-differential signal and, hence, is self-referenced. Thus, the comparator does not need a reference voltage at its input. To reduce the loading effect of the comparator, the input transistors are small and can have a noticeable offset. To compensate for this offset as well as the offset due to the kickback error and clock feedthrough, the bulk bias voltages of the input transistors (i.e., $V_{BP}$ and $V_{BN}$ in Fig. 4.6) are controlled off-chip to adjust their threshold voltages [43]. In the experimental test setup, the $V_{BN}$ of both channels are connected to the same bias voltage (0.3 V), and only the $V_{BP}$ of each channel is tuned. Thus, in total, only one bias voltage per channel needs to be tuned to compensate for the offset error. This offset cancellation is also used to cancel the DC component of the photocurrent.

As shown in Fig. 4.6, the comparator only requires one clock phase. The proposed architecture has the advantage of not requiring  $\overline{Clock}$  for its comparators, as both channels can be clocked using the same phase. Moreover, to improve the comparator performance, the duty cycle of this clock can be tuned. For instance, in this design, due to the low mobility of the PMOS transistors, the PMOS transistors (MP) that are used to reset the comparator outputs should be large enough to do their function. By increasing the off-time of the clock signal, they have more time to reset the output, and hence their size can be reduced. Thus, the comparator speed can be improved when considering that the duty cycle is also a design parameter. Furthermore, at the same speed, it is possible to obtain a better signal detection due to the improved output resetting of the comparator. In this design, by using a clock that has a 45 % duty cycle (i.e., 55 % off-time), the measured input optical modulation amplitude (OMA) sensitivity of the receiver is improved by 1.1 dB from -5.9 dBm to -7 dBm. Adjustable duty-cycle circuits [44, 45] can be used to realize such a duty cycle.

The latch is implemented using a simple transmission gate (TGATE) switch that consists of both NMOS and PMOS switches. It is only ON during the ON-time of the comparator, when the outputs are valid. To compensate for the comparator delay, the latch clock signal $CLK_D$ is delayed accordingly. The PMOS and the NMOS transistors of the switch are sized through simulations such that the charge injection and clock feedthrough of the switch are minimized. The latch requires the inverse of the input clock (Fig. 4.6). Here, $\overline{CLK_D}$ is a delayed version of the $\overline{CLK}$ signal. Although $\overline{CLK_D}$ could be implemented on-chip, here, it is provided off-chip to study its accuracy requirement. In the measurement, it is observed that $\overline{CLK_D}$ does not need to be precise in time and can have up to 10 ps of delay as compared to a true 180° phase clock, without affecting the system performance. In a conventional receiver, such an inaccuracy in the inverted clock, when it is used as the 180° phase clock, leads to bit errors since this clock samples the second channel and the eye width of the signal at the input of the comparator is limited. For instance, at 8.5 Gbps, a 10 ps timing error corresponds to a 0.09 UI reduction in the width of the eye diagram opening of a typical two-channel receiver. Note that a 10 ps timing error in 65 nm CMOS technology, with a typical digital-gate rise/fall time of 20 ps to 30 ps, is a relatively small value. For two chains of only three inverters with aspect ratios of 8 and 16 for the NMOS and PMOS transistors of all inverters, respectively, the delay difference between the two chains can vary by 4.6 ps $(3 \sigma_{inv})$. Usually, the clock path requires a longer chain to distribute the clock signals at high frequencies and, hence, can have a timing error larger than 4.6 ps. Back-to-back inverters or bigger inverters can be used to improve matching, but they increase power consumption. Moreover, back-to-back inverters are not very effective for small delays, due to the limited rise/fall times of the inverters.
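The conversion between an absolute clock timing error and the corresponding loss of eye width can be sketched as follows. This is only an illustration of the arithmetic quoted above; the helper name `timing_error_ui` is hypothetical, and the per-channel rate is taken as half of the 17 Gbps line rate.

```python
def timing_error_ui(error_seconds, rate_bps):
    """Express a timing error as a fraction of the unit interval (UI)."""
    return error_seconds * rate_bps


PER_CHANNEL_RATE = 8.5e9   # bit/s, half of the 17 Gbps line rate
CLOCK_ERROR = 10e-12       # s, tolerated error of the inverted clock
INVERTER_MISMATCH = 4.6e-12  # s, 3-sigma mismatch of two three-inverter chains

eye_loss_ui = timing_error_ui(CLOCK_ERROR, PER_CHANNEL_RATE)
mismatch_ui = timing_error_ui(INVERTER_MISMATCH, PER_CHANNEL_RATE)

print(f"10 ps error  -> {eye_loss_ui:.3f} UI")
print(f"4.6 ps error -> {mismatch_ui:.3f} UI")
```

The 10 ps case evaluates to 0.085 UI, which the text rounds to 0.09 UI.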

The insensitivity of the proposed receiver to the inverted clock timing error is mainly due to the fact that the TGATE switch also has an NMOS transistor to pass the signal at the right time. Moreover, the TGATE clock signals are always designed to tolerate some timing error, i.e., here, they turn off the TGATE 10 ps before the signal at the input of the TGATE resets. Also, the circuit utilized to generate  $\overline{CLK_D}$  has only 3 ps (3  $\sigma_{opt}$ ) of delay variation due to mismatches, providing sufficient margin for correct operation of the receiver.

#### 4.3 Measurement results

The proposed two-channel receiver was implemented in a 65 nm CMOS technology and mounted in a QFN80 package. To emulate the optical splitter, the delay, and the PD functionality presented in [18, 39], a discrete optical fiber splitter and a mechanically tunable optical delay-line are used to generate two optical signals, where one signal is the delayed version of the other signal. Then, both of these optical signals are coupled to two photodetectors. The 30 GHz InGaAs photodetectors from Global Communication Semiconductors (P/N: DO309\_20um\_C3) have a responsivity of 0.7 A/W. The photodetectors are mounted in the QFN80 package next to the receiver die. The photodetectors are bonded to the receiver inputs using bondwires with a length of 1 mm. Their estimated inductance of 1 nH matches the model in Fig. 4.4. Fig. 4.7 shows a micrograph of the electronic receiver chip and the connections of the photodetectors to the receiver inputs. The CMOS chip active area is 300 $\mu$m $\times$ 300 $\mu$m per channel.



**Fig. 4.7:** Packaged chip micrograph of the receiver and its connections to photodetectors with 1 mm bondwires.

Both channels use the same clock signal. Careful symmetric layout techniques ensure that the off-chip clock is distributed similarly to both channels. The optical delay line is manually tuned to generate the required delay of $T_D$. An integrated splitter and delay in SiP were demonstrated in [18, 39]. Tunable photonic delay lines are also available and can generate a wide range of delays, relaxing the accuracy necessary in the fabrication of fixed delay lines [29, 40]. It should be noted that tuning the delay is only required if the delay error is comparable to the width of the eye opening. In such a case, the power consumption of the optical delay must be included in the total power consumption.

Fig. 4.8 illustrates the experimental test setup. Here, the 1550 nm light from the laser is coupled to a fiber that is connected to a polarization controller and is then modulated with an electrical 17 Gbps PRBS 10 signal. The modulated optical signal is then passed to the optical splitter, which has a measured insertion loss of 3.3 dB. For the measurements, we use off-chip components such as free-space delay lines and optical splitters to emulate the operation of the integrated silicon photonic chip. Since the data rate is 17 Gbps, the clock frequency for both receiver channels is 8.5 GHz. Fig. 4.9 shows the bit error rate (BER) versus the input optical signal power of the two-channel receiver before the splitter, at 17 Gbps. This BER measurement is performed using a Centellax TG1B1-A BERT. As shown, to achieve a BER of $10^{-12}$, the optical input sensitivity of the receiver is -7 dBm OMA. This sensitivity is achieved without using any equalization technique. Implementing an equalization technique, such as decision feedback equalization (DFE), would improve the sensitivity [36]. We believe that the sensitivity of this receiver was mainly limited by the voltage swing required at the input of the comparator rather than by thermal noise. This is because there is only one gain stage (the TIA), which means that the input signal needs to be sufficiently large to drive the comparator stage. The design choice of having only one gain stage was made to improve energy efficiency, at the cost of a slight degradation in sensitivity. Optimal energy efficiency is targeted and thus no equalization is implemented here. Note that the input sensitivity of each path is 3.3 dB lower than the sensitivity of the full receiver. Fig. 4.10 shows the bathtub curve of the receiver for an OMA input of -6 dBm with respect to the sampling clock. The receiver tolerates up to $105^{\circ}$ of eye opening (equal to 0.3 UI) at a BER of $10^{-12}$.
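To give a feel for the signal levels behind the swing-limited sensitivity argument, the sketch below converts the measured OMA into a per-path photocurrent and an approximate voltage swing at the comparator input. It is a back-of-the-envelope estimate under stated assumptions: the per-path OMA is taken as the receiver OMA minus the 3.3 dB splitter loss, coupling losses into the photodetector are ignored, and the low-frequency pseudo-differential gain $R_F$ from (4.3) is assumed to hold at the data rate. The helper names are hypothetical.

```python
def dbm_to_watts(power_dbm):
    """Convert a power level in dBm to watts."""
    return 1e-3 * 10.0 ** (power_dbm / 10.0)


OMA_RECEIVER_DBM = -7.0   # measured sensitivity at the splitter input
SPLITTER_LOSS_DB = 3.3    # measured splitter insertion loss
RESPONSIVITY = 0.7        # A/W, photodetector responsivity
R_F = 320.0               # ohms, low-frequency gain of (4.3), Table 4.1

# Assumption: each path sees the receiver OMA reduced by the splitter loss.
oma_path_w = dbm_to_watts(OMA_RECEIVER_DBM - SPLITTER_LOSS_DB)

# Peak-to-peak photocurrent and the resulting swing at the comparator input.
i_pp = RESPONSIVITY * oma_path_w
v_pp = i_pp * R_F

print(f"per-path OMA:     {oma_path_w * 1e6:.1f} uW")
print(f"photocurrent:     {i_pp * 1e6:.1f} uA peak-to-peak")
print(f"comparator swing: {v_pp * 1e3:.1f} mV peak-to-peak")
```

Under these assumptions the comparator sees a swing on the order of 20 mV, which illustrates why the comparator input swing, rather than thermal noise, is a plausible limit with a single gain stage.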



Fig. 4.8: Experimental test setup used to validate the optical receiver.

Fig. 4.11 shows the eye diagram of the signal at the output of the optical modulator along with one at the output of the receiver. Since the output signal amplitude (25 mVpp) is smaller than the input sensitivity of the error detector (ED), which is 100 mVpp in this case, it is amplified using a wideband amplifier with a 20 dB gain before the ED. The ringing on the eye diagram is due to the fact that the output is single-ended with a low amplitude and that there is a few millivolts of clock feedthrough through the PCB or package bondwires.

**Fig. 4.9:** Bit error rate (BER) of the full receiver for a 17 Gbps PRBS 10 optical input signal versus the OMA at the input of the splitter, including the 3.3 dB splitter loss.

Fig. 4.10: Full receiver bathtub curve at a 17 Gbps input.

Fig. 4.11: The eye diagram of the input and output signals. Since the output signal is single-ended and has a small amplitude, it is slightly distorted by some common-mode noise.

In each channel, the TIA power consumption is 0.95 mW, and the total power consumption of the comparator, the latch, and the clock distribution is 0.38 mW. Thus, for the full two-channel receiver running at 17 Gbps, the power consumption is 2.66 mW, resulting in a power efficiency of 156 fJ/bit for a BER of $10^{-12}$. Here, the power consumption of the output drivers is excluded. Table 4.2 compares this receiver with the state-of-the-art. Overall, this novel receiver has a superior power efficiency as compared to the ones previously reported in the literature [3, 35, 36, 46–48]. Whereas [48] achieves a similar energy efficiency, it is implemented in a smaller technology node that contributes to lowering the dynamic power consumption of the clock generation blocks. The receiver presented here also achieves good sensitivity despite the 3.3 dB splitting loss. This highlights the feasibility of the proposed receiver approach. An implementation in a more advanced technology node would allow for higher speed, leading to better energy efficiency.
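The energy-efficiency figure follows directly from the per-channel power numbers quoted above. The short sketch below reproduces the arithmetic and applies the same power-over-data-rate ratio to two of the power and data-rate entries of Table 4.2; the function name is illustrative.

```python
def energy_per_bit_pj(power_mw, rate_gbps):
    """Energy efficiency in pJ/bit from power (mW) and data rate (Gb/s)."""
    return power_mw / rate_gbps


TIA_MW = 0.95                  # per channel
COMPARATOR_LATCH_CLK_MW = 0.38  # per channel
N_CHANNELS = 2
DATA_RATE_GBPS = 17.0

total_mw = N_CHANNELS * (TIA_MW + COMPARATOR_LATCH_CLK_MW)
this_work_pj = energy_per_bit_pj(total_mw, DATA_RATE_GBPS)

print(f"total power:       {total_mw:.2f} mW")
print(f"energy efficiency: {this_work_pj * 1e3:.1f} fJ/bit")

# Same ratio applied to (power in mW, data rate in Gb/s) pairs from Table 4.2.
for ref, (p_mw, r_gbps) in {"[35]": (9.6, 24.0), "[48]": (4.25, 25.0)}.items():
    print(f"{ref}: {energy_per_bit_pj(p_mw, r_gbps):.2f} pJ/bit")
```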

|                            | This work | [47]  | [35]      | [36]   | [46]       | [48]  | [3]        |
|----------------------------|-----------|-------|-----------|--------|------------|-------|------------|
| CMOS node (nm)             | 65        | 40    | 65        | 65     | 90         | 28    | 65         |
| Data rate (Gb/s)           | 17        | 25    | 24        | 20     | 16         | 25    | 25         |
| Sensitivity (dBm)          | -7.0 OMA  | -10.8 | -4.7      | -5 OMA | -5.4       | -14.9 | -8         |
| Power consumption (mW)     | 2.66      | 27.6  | 9.6       | 14.2   | 23         | 4.25  | 17         |
| Energy efficiency (pJ/bit) | 0.156     | 1.13  | $^{1}0.4$ | 0.71   | $^{2}1.43$ | 0.17  | $^{3}0.68$ |

**Table 4.2:** Performance summary and comparison.

#### 4.4 Conclusion

A 17 Gbps two-channel optical receiver with an energy consumption of 156 fJ/bit was presented. The combination of simplified clocking, signal amplification with only one gain-improved TIA, and a dynamic comparator results in superior energy efficiency. The full receiver was implemented in 65 nm CMOS. The receiver die and the photodetectors were mounted in a QFN80 package and connected together using bondwires. An input sensitivity of -7 dBm OMA was achieved for this receiver without using any equalization technique. Superior energy efficiency was achieved, and successful receiver operation using two clock phases was demonstrated.

This architecture is suitable for integration with SiP circuits, which can be used to achieve the required optical function at the input, allowing for a high degree of integration. The architecture exhibits a performance that compares favorably to the state-of-the-art.

<sup>1</sup>Without clock generation and SR latch.

<sup>2</sup>Power consumption of the front-end only.

<sup>3</sup>Includes a clock receiver.

# Chapter 5

# A 22 Gb/s Time-Interleaved Optical Receiver with Integrating Front-end

This chapter improves upon the two previous receivers by employing a low-bandwidth front-end. This type of front-end allows the receiver to operate at higher data rates with a given front-end bandwidth. The novel nature of this implementation requires a detailed analysis of the front-end. Thus, this chapter offers extensive design details of this implementation. The work presented in this chapter has been accepted for publication as a journal paper in the IEEE Journal of Solid-State Circuits [7].

#### 5.1 Introduction

As CMOS technology scaling advances, a larger number of transistors can be placed in a given area. One challenge in CMOS scaling is the analog front-end on the receiver side where, conventionally, a transimpedance amplifier (TIA) is used to convert the photocurrent into a voltage while providing a low input impedance to the photodetector. Conventional TIAs are bulky, power-hungry, and do not scale well with technology. This is because, at higher speeds, a high-gain core amplifier (or a multi-stage amplifier) is needed, leading to increased power consumption and resulting in TIAs with large size. Consequently, there has been a recent interest in developing optical receivers that do not require conventional TIAs but instead use low-bandwidth techniques [34–36, 46–55]. Those low-bandwidth receivers can be divided into three categories: integrating front-end receivers [35, 46, 48–50], resettable receivers [34, 47, 51–53], and decision feedback equalizer (DFE) based receivers [36, 54, 55].

In integrating front-end receivers, a capacitive front-end is used to integrate the photocurrent and a decision is made based on the value of the integrated voltage. The receiver by Palermo et al. [46] employs a double sampling technique in which the integrated voltage difference is used to resolve the value of the bit. This approach suffers from consecutive identical digits (CID) induced issues that cause the voltage difference to decrease when identical bits are received. The receiver by Nazari et al. [35] mitigates the CID issue by introducing a dynamic offset modulation circuit. However, charge sharing between the sampling capacitors and the input capacitance degrades the sensitivity of the receiver. Saeedi and Emami [49] resolved the issue of charge sharing by introducing a low-bandwidth TIA at the input of the chip, decoupling the sampling capacitor from the input capacitance and thus improving sensitivity. The same group [48] employed advanced packaging techniques to reduce parasitic capacitance at the input, leading to further improvements in sensitivity. The second receiver category is resettable receivers, which employ a reset to discharge the capacitor before integrating the next bit [34, 47, 51–53]. This technique resolves the issues associated with CID at the cost of stricter timing requirements and an incomplete bit integration in [34, 47, 51, 52], leading to degraded sensitivity. The receiver in [53] addressed the incomplete integration period by interleaving four data paths but requires a wideband input stage, a common-mode feedback circuit (CMFB), and four clock phases for proper operation. The third approach in [36, 54, 55] uses DFE or speculative DFE to compensate for bandwidth reduction at the input. These approaches have either a critical timing requirement for the feedback or increased complexity with the number of taps in the speculative DFE implementation.

In this chapter, a resettable two-bit integrating front-end receiver is demonstrated in order to resolve issues associated with CID and charge sharing present in integrating front-end receivers. The proposed architecture also relaxes the timing requirements of the reset signal in resettable receivers, and requires, as a result of the use of optically interleaved inputs, only two quarter data rate clock phases (provided externally for this receiver). Thus, there is no need for complex circuits to correct duty cycle and phase, which are critical for quarter-rate operation at high speeds relying on quadrature clock generation [32]. Therefore, the proposed quarter-rate clocking scheme is more energy-efficient and has a wider time margin compared to full-rate and half-rate clocking schemes [31].

The chapter is organized as follows: in section 5.2, integrating type front-end receivers and resettable receivers' architectures and their limitations are discussed. Section 5.3 details the proposed time-interleaved optical receiver with a two-bit integrating front-end. More specifically, the receiver architecture, operation, analysis of the front-end, noise analysis, and transistor implementation are presented. Section 5.4 discusses the experimental validation of the receiver. Section 5.5 summarizes the receiver and compares it to other published receivers. Section 5.6 discusses some of the silicon photonics structures that could be integrated with this receiver. Finally, section 5.7 concludes the chapter.

### 5.2 Low-bandwidth receiver architecture

## 5.2.1 Integrating receiver front-end

The front-end of the integrating receiver is shown in Fig. 5.1. The junction capacitance of the photodetector (PD) and the input capacitance, $C_{IN}$, together with a resistor, $R$, form a low-frequency pole at the input that integrates the photocurrent into a voltage signal. The voltage signal is then sampled every unit interval (UI) using four clock phases and sampling capacitors, $C_S$, as shown in Fig. 5.2. The voltage difference, $\Delta v_x$ (x = 1,2,3,4), between samples is used to resolve the bit. If $\Delta v_x > 0$, the bit is determined to be a binary "1", and it is considered a binary "0" for $\Delta v_x < 0$. Assuming that the capacitor is fully discharged at the beginning of the process (i.e., t = 0), $\Delta v_1$, $\Delta v_2$, and $\Delta v_3$ can be written as follows for a sequence of three consecutive binary ones (i.e., 111):

$$\Delta v_1 = RI_{PD} (1 - e^{\left(\frac{-T_b}{RC_{IN}}\right)}) \tag{5.1}$$

$$\Delta v_2 = RI_{PD}(1 - e^{(\frac{-T_b}{RC_{IN}})})e^{(\frac{-T_b}{RC_{IN}})} = \Delta v_1 e^{(\frac{-T_b}{RC_{IN}})} \tag{5.2}$$

$$\Delta v_3 = \Delta v_2 e^{\left(\frac{-T_b}{RC_{IN}}\right)} \tag{5.3}$$

where $I_{PD}$ is the peak photodetector current and $T_b$ is the bit period. Note that $\pm \Delta v_1$ is the largest possible difference between two samples (i.e., $\Delta v_{max}$), which occurs when a binary 1 is received while the capacitor is discharged. The voltage difference becomes smaller as more identical bits are received, challenging the receiver, as the comparator needs to make a decision based on this smaller voltage difference.

Fig. 5.1: Simplified integrating front-end receiver architecture with the four clock phases $\Phi_1$ to $\Phi_4$.

It is possible to mitigate this issue by introducing a dynamic offset modulation (DOM) circuit [35]. The DOM modifies the sense amplifier offset such that the inputs of the comparator are maintained at a constant voltage difference, as shown in Fig. 5.3. The offset is indicated by the red arrows in the figure. It can be shown that the voltage difference when DOM is employed is [35]:

$$\Delta v_{DOM} = \frac{1}{2} R I_{PD} (1 - e^{(\frac{-T_b}{RC_{IN}})}) = \frac{1}{2} \Delta v_{max} \tag{5.4}$$

The achievable  $\Delta v_{DOM}$  is half of the maximum possible voltage difference,  $\Delta v_{max}$ . Thus, the comparators need to be able to resolve this reduced voltage difference at all times.
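The shrinking voltage difference described by (5.1) to (5.3), and the fixed DOM level of (5.4), can be illustrated numerically. The sketch below is a minimal example; the component values ($R$, $C_{IN}$, $I_{PD}$, and the data rate) are hypothetical and chosen only to make the trend visible, and the function name is illustrative.

```python
import math


def delta_v_run_of_ones(n_bits, r, c_in, i_pd, t_b):
    """Sample-to-sample voltage differences for n consecutive ones,
    starting from a discharged capacitor, following (5.1)-(5.3)."""
    decay = math.exp(-t_b / (r * c_in))
    dv1 = r * i_pd * (1.0 - decay)
    return [dv1 * decay ** k for k in range(n_bits)]


# Hypothetical values, not taken from the thesis.
R = 5e3          # ohms
C_IN = 100e-15   # farads
I_PD = 50e-6     # amperes, peak photocurrent
T_B = 1 / 10e9   # seconds, bit period at an assumed 10 Gb/s

dv = delta_v_run_of_ones(3, R, C_IN, I_PD, T_B)
dv_max = dv[0]
dv_dom = 0.5 * dv_max  # (5.4): constant difference when DOM is used

for k, value in enumerate(dv, start=1):
    print(f"delta_v{k} = {value * 1e3:.2f} mV")
print(f"delta_v_DOM = {dv_dom * 1e3:.2f} mV")
```

Each additional identical bit scales the difference by the same factor $e^{-T_b/(RC_{IN})}$, whereas the DOM level stays at half of $\Delta v_{max}$ regardless of the run length.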

Fig. 5.2: Voltage at the input of the sampling circuit when the sequence 1110 is received. $\Delta v_x$ (x = 1,2,3,4) is the voltage difference between two consecutive samples.

Fig. 5.3: Basic operation of the dynamic offset modulation (DOM) in the receiver to compensate for CID. The red arrows indicate the offset generated by the DOM circuit to compensate the $\Delta v$ shown in Fig. 5.2 and clamp the voltage difference to $\pm (\Delta v_{max})/2$.

Charge sharing at the input is an issue in integrating front-end receivers. The total charge is shared between $C_{IN}$ and four of the eight sampling capacitors, $C_S$. This degrades the receiver sensitivity. A photodiode with a junction (input) capacitance larger than the sampling capacitance can be used to mitigate this. This way, most of the charge is stored in the junction capacitance for subsequent sampling, as expressed by:

$$Q_{IN} = \frac{C_{IN}}{C_{IN} + 4C_s} Q_{total} \tag{5.5}$$

where  $Q_{IN}$  is the charge stored in the input capacitance,  $C_{IN}$ , and  $Q_s$  is the charge stored in the sampling capacitors,  $C_s$ .  $Q_{total}$  is the total charge at the input. Based on (5.5), there is a minimum required size for the photodiode for proper operation. However, the signal to noise ratio (SNR) is inversely proportional to the size of the junction capacitance and a large junction capacitance degrades the sensitivity of the receiver. The SNR is approximated by:

$$\sqrt{SNR} \approx \frac{\frac{I_{PD}T_b}{C_{IN}}}{\sqrt{\frac{kT}{C_{IN}}}} = \frac{I_{PD}T_b}{\sqrt{C_{IN}kT}} \tag{5.6}$$

where $k$ is the Boltzmann constant, $T$ is the temperature, and $I_{PD}$ is the peak photocurrent. A solution to the charge sharing issue is to use a low-bandwidth TIA that decouples the junction capacitance from the sampling capacitance [49]. However, this requires an additional circuit at the input of the receiver.
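The trade-off between (5.5) and (5.6) can be made concrete with a small sweep: a larger $C_{IN}$ keeps more of the charge at the input but lowers the SNR. The following sketch evaluates both expressions; the capacitances, photocurrent, bit period, and temperature are hypothetical values used only for illustration, and the function names are not from the thesis.

```python
import math

K_B = 1.380649e-23  # J/K, Boltzmann constant


def input_charge_fraction(c_in, c_s):
    """Fraction Q_IN / Q_total retained on C_IN, from (5.5)."""
    return c_in / (c_in + 4.0 * c_s)


def sqrt_snr(i_pd, t_b, c_in, temp_k=300.0):
    """sqrt(SNR) approximation of (5.6)."""
    return i_pd * t_b / math.sqrt(c_in * K_B * temp_k)


# Hypothetical values, not taken from the thesis.
C_S = 10e-15    # farads, one sampling capacitor
I_PD = 50e-6    # amperes, peak photocurrent
T_B = 40e-12    # seconds, bit period (25 Gb/s)

for c_in in (50e-15, 100e-15, 200e-15):
    frac = input_charge_fraction(c_in, C_S)
    snr_root = sqrt_snr(I_PD, T_B, c_in)
    print(f"C_IN = {c_in * 1e15:5.0f} fF: "
          f"Q_IN/Q_total = {frac:.2f}, sqrt(SNR) = {snr_root:.1f}")
```

Quadrupling $C_{IN}$ halves $\sqrt{SNR}$ in this model, which is the reason the photodiode cannot simply be made arbitrarily large to suppress charge sharing.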

## 5.2.2 Resettable receiver, current-amplifier-based receivers, and integrate-and-dump receiver

Resettable front-end receivers [51] and current-amplifier-based optical receivers resolve the processing issues related to CID and charge sharing. These design approaches also mitigate the potential issue of overloading of the integrator present in integrating front-end receivers by periodically resetting the input capacitance. The operation of a resettable receiver is illustrated in Fig. 5.4. This implementation uses a full-rate clock, letting the input capacitor charge for 0.5 UI and then discharge for 0.5 UI. Only half of the maximum charge is stored across the capacitor, which affects sensitivity. This implementation requires fast sample-and-hold and slicer circuits to sample and resolve the half-integrated bit before the capacitor is reset.

**Fig. 5.4:** Resettable receiver architecture operation and timing diagram showing the integration for 0.5 UI and reset for 0.5 UI.

Fig. 5.5: Current-amplifier-based receiver architecture showing two interleaved paths and sampling using two phases $(\Phi, \overline{\Phi})$ and a delayed version of the two phases $(\Phi_d, \overline{\Phi_d})$.

Current-amplifier-based optical receivers, shown in Fig. 5.5, alleviate these issues by introducing a dual-path current amplifier [34, 47, 52]. The cycle of operation lasts two UIs, as shown in Fig. 5.6. This allows more time for the latch to regenerate the output. Moreover, this type of receiver improves the integration time by allocating 0.75 UI for bit integration instead of 0.5 UI. Only 25 % of the bit charge is lost due to the 0.25 UI reset pulse. The duration of the reset pulse is 10 ps at 25 Gb/s, requiring careful design and proper phase alignment. Longer reset pulses degrade sensitivity, while shorter ones are more difficult to achieve and may result in an excess residual charge. Moreover, process variations can adversely impact such short pulses.
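The reset-pulse width and integration fractions quoted for the two resettable schemes can be reproduced with a few lines of arithmetic. The sketch below assumes the charge collected scales linearly with the integration window, which is a simplification; the function and dictionary names are illustrative.

```python
def unit_interval(rate_bps):
    """Unit interval (bit period) in seconds for a given data rate."""
    return 1.0 / rate_bps


DATA_RATE = 25e9  # bit/s

# Fraction of each UI spent integrating, as described in the text.
INTEGRATION_FRACTION = {
    "resettable (full-rate clock)": 0.50,
    "current-amplifier-based": 0.75,
}

ui = unit_interval(DATA_RATE)
reset_pulse = (1.0 - INTEGRATION_FRACTION["current-amplifier-based"]) * ui

print(f"UI at 25 Gb/s: {ui * 1e12:.0f} ps")
print(f"reset pulse:   {reset_pulse * 1e12:.0f} ps")
for name, frac in INTEGRATION_FRACTION.items():
    # Simplifying assumption: collected charge is proportional to the window.
    print(f"{name}: integrates {frac * ui * 1e12:.0f} ps "
          f"({frac:.0%} of the bit charge)")
```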

**Fig. 5.6:** Timing and operation of the current-amplifier-based receiver showing the reset (0.25 UI), sample (0.75 UI), and hold phases (1 UI).

To address the incomplete integration period, the integrate-and-dump receiver, shown in Fig. 5.7, was proposed in [53]. The receiver has a wideband current amplifier at the input followed by four low-bandwidth transimpedance amplifiers, one in each of the four data paths. The four data paths are time-interleaved and have four phases of operation, shown in Fig. 5.8, described next for one of the data paths. The first phase is the internal reset that begins when $\Phi_1 = 1$ and $\Phi_2 = 0$. In this phase, the input and the output of the amplifier are connected through a switch, resetting those nodes. The next phase is the external reset phase, when both switches are high. The integrate phase is next and starts when $\Phi_1 = 0$ and $\Phi_2 = 1$. In this phase, the current from the current amplifier is integrated. Finally, in the hold phase, when $\Phi_1 = 0$ and $\Phi_2 = 0$, the integrated voltage is held for sampling by the latch.

This approach addresses the short reset pulse issue of the current-amplifier-based receiver but requires wideband input stages and four clock phases to achieve a demux-by-four operation. The receiver also requires a common-mode feedback (CMFB) circuit to ensure that the input and the outputs of the low-bandwidth TIAs are properly reset. Moreover, the hold period is 1 UI, which may limit the speed at high data rates.



Fig. 5.7: Integrate-and-dump receiver showing four interleaved paths. It utilizes four clock phases  $(\Phi_1, \Phi_2, \overline{\Phi_1}, \overline{\Phi_2})$ .



**Fig. 5.8:** Timing and operation of the integrate-and-dump receiver showing the four phases: internal reset, external reset, integrate, and hold.

## 5.3 Proposed time-interleaved receiver

#### 5.3.1 Architecture and operation

The time-interleaved two-bit integrating receiver proposed in this chapter is shown in Fig. 5.9 along with the timing diagram of operation in Fig. 5.10. On the transmitter side, the bit pattern  $B = [B_1 \ B_2 \ B_3 \dots B_N]$  is precoded into the data pattern  $D = [D_1 \ D_2 \ D_3 \dots D_N]$  using the following relationship:

$$D_k = B_k \oplus D_{k-1} \tag{5.7}$$

This precoding is derived from the five-level polybinary signaling for spectrally efficient data links and adapted here for two-level signaling [56]. On the receiver side, bit pattern B can be recovered from received data pattern D using the following equation:

$$B_k = (D_k + D_{k-1}) \mod 2 = D_k \oplus D_{k-1} \tag{5.8}$$

Thus, the decoder on the receiver side is simply an XOR logic gate. The benefit of employing this algorithm is that the bit pattern B is recovered from the received signal D without considering bits from previous operation cycles. This coding prevents error propagation between cycles.
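The precoding of (5.7) and the decoding of (5.8) can be sketched in a few lines. This is an illustrative model only: the initial state $D_0 = 0$ is an assumption (the text does not specify it), and the function names are hypothetical. The last part of the example flips one received bit to show that the resulting decoding errors stay confined to the two decoded bits that depend on it.

```python
def precode(bits, d_init=0):
    """Transmitter precoding of (5.7): D_k = B_k XOR D_{k-1}."""
    data = []
    d_prev = d_init  # assumed initial state D_0
    for b in bits:
        d_prev = b ^ d_prev
        data.append(d_prev)
    return data


def decode(data, d_init=0):
    """Receiver decoding of (5.8): B_k = (D_k + D_{k-1}) mod 2."""
    bits = []
    d_prev = d_init
    for d in data:
        bits.append((d + d_prev) % 2)
        d_prev = d
    return bits


tx_bits = [1, 0, 1, 1, 0, 0, 1, 0]
tx_data = precode(tx_bits)
rx_bits = decode(tx_data)
print("B  =", tx_bits)
print("D  =", tx_data)
print("B' =", rx_bits)

# Flip one received symbol and count the decoded bit errors.
corrupted = list(tx_data)
corrupted[3] ^= 1
errors = [i for i, (a, b) in enumerate(zip(tx_bits, decode(corrupted))) if a != b]
print("bit errors at indices:", errors)
```

Because each decoded bit depends only on $D_k$ and $D_{k-1}$, a single corrupted symbol affects at most two adjacent decoded bits and does not propagate further.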

Fig. 5.9: Block diagram of the four proposed sub receivers and connection to the optical blocks with the delay scheme used. A single pole PD model is shown in the inset.

**Fig. 5.10:** Timing diagram showing the operation of the receiver and the two phases of operation.

The optical input signal is divided using a passive optical splitter and interleaved in time using optical delay lines. The inputs of sub receivers 2 and 3 are delayed by a one-bit period, $T_b$, relative to sub receivers 1 and 4. This passive operation of splitting and delaying the light can be performed using silicon photonic (SiP) technologies, as described in section 5.6. The light is then coupled to a photodetector array by using four PDs, which are wire bonded to the four sub receivers. At the front-end, to maximize the sensitivity of the receiver, the input capacitance, $C_{IN}$, is used as the integration capacitor without adding a capacitor on-chip. Only the top metal layer is used for the pads to reduce the input capacitance. The input capacitance is used simultaneously as both the integrating and sampling capacitor in order to resolve the charge sharing issue.

The operation of the receiver starts by integrating the photocurrent for two UIs over an initially fully discharged input capacitance. At the end of the integration phase, a switch is used to discharge this capacitor. The duration of the reset phase is two UIs. There are four possible waveforms over the integration period corresponding to the four possible combinations of the integrated bits: 00, 01, 10, and 11. These four waveforms are shown in Fig. 5.11 when

the bandwidth of the photodetector is high and in Fig. 5.12 when the bandwidth of the photodetector is limited and well below the data-rate. The resulting triangular overlay of all possibilities (i.e., the eye diagram) at the input is shown at the bottom of both Fig. 5.11 and Fig. 5.12. This triangular waveform represents the symbols at the input of the front-end. Since the symbol rate is half the data rate, all of the following stages can halve their bandwidth requirements compared to full-bandwidth systems. Conventionally, an analog front-end requires a bandwidth of at least 70 % of the data rate. In the proposed low-bandwidth receiver, the bandwidth requirement can be relaxed down to 35 % of the data rate. The two-bit symbol is amplified using two voltage gain stages.

While there are four front-ends in the proposed receiver as opposed to one in a conventional receiver, the power consumption of the front-ends remains similar to that of a single front-end operating at full-rate. This is because the bandwidth required is halved. The first-order voltage gain, $A_v$, of a single stage is given by:

$$A_v = \frac{g_m \times R_d}{1 + sR_d C_L} \tag{5.9}$$

where  $g_m$  is the small-signal transconductance,  $R_d$  is the drain resistance, and  $C_L$  is the load capacitance. Meanwhile, the bandwidth,  $\omega_s$ , is given by:

$$\omega_s = \frac{1}{R_d C_L} \tag{5.10}$$



Fig. 5.11: Voltage integration ( $\Delta v$ ) at the front-end for all possible input values when the bandwidth of the photodetector is higher than 0.7 of the data-rate. The bottom part shows an overlay of all  $\Delta v$  possibilities.



Fig. 5.12:  $\Delta v$  when the bandwidth of the photodetector is lower than 0.7 of the data-rate.

For a given gain, if the bandwidth is halved from  $0.7 \times$  data-rate to  $0.35 \times$  data-rate, then  $R_d$  can be doubled and  $g_m$  can be halved. Since  $g_m \propto \sqrt{I_d}$ , halving  $g_m$  reduces the bias current, and hence the power, of each front-end in the  $0.35 \times$  data-rate case to one quarter of that of a full-rate front-end, so that the four front-ends together consume about as much as a single full-rate front-end. Note that the clock generation circuitry power consumption is expected to be lower in the proposed architecture due to the reduced number of clock phases required.
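The scaling argument can be written out as a short Python sketch. It assumes a fixed load capacitance and a fixed DC gain, uses only the first-order relations of (5.9) and (5.10) together with $g_m \propto \sqrt{I_d}$, and works with normalized quantities; the function name is hypothetical.

```python
def front_end_power_scale(bandwidth_scale):
    """Relative bias power of one gain stage at fixed gain and fixed C_L.

    bandwidth ~ 1/(R_d * C_L)  -> R_d scales by 1/bandwidth_scale
    gain = g_m * R_d constant  -> g_m scales by bandwidth_scale
    g_m ~ sqrt(I_d)            -> I_d (and power) scales by bandwidth_scale**2
    """
    r_d_scale = 1.0 / bandwidth_scale
    g_m_scale = 1.0 / r_d_scale
    return g_m_scale ** 2

# Halving the bandwidth from 0.7x to 0.35x of the data rate.
per_front_end = front_end_power_scale(0.5)
total_four_front_ends = 4 * per_front_end
```

Under these first-order assumptions each front-end needs one quarter of the full-rate power, and four of them add up to the power of one full-rate front-end.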

The integrated symbol is fed to two current mode logic (CML) flip-flops consisting of two CML latches. Each latch is clocked with two complementary quarter-rate clocks ( $\Phi$  and  $\overline{\Phi}$ ), providing a 1:4 demultiplexing operation. Only two quarter-rate clock phases are needed, as opposed to four in conventional receivers, to carry out the 1:4 demultiplexing operation. This eliminates the need for the duty cycle and quadrature detection/correction circuits found in quarter-rate clock generation circuits, while still benefiting from the wide timing margin offered by quarter-rate operation. The two outputs of the two CML flip-flops are then fed to two differential pairs used as CML-to-CMOS converters followed by two D flip-flops. Finally, the two outputs are fed to an XOR logic gate for decoding as described by (5.8). The output of the XOR gate is then buffered to drive the measurement equipment, which represents a load  $R_L$  of 50  $\Omega$.

## 5.3.2 Analysis of the integration

The front-end of the receiver integrates two bits, leading to four possible waveforms (Fig. 5.12). Expressions for these waveforms are derived below by modeling the PD as a first-order, single-pole low-pass filter with a bandwidth of  $\omega$ . The model of the PD is shown in the inset of Fig. 5.9. The integrated voltage is given by:

$$\Delta v = \frac{1}{C_{IN}} \int_0^{2T_b} i_{PD}(t)dt \tag{5.11}$$

The photocurrent for the case of  $D_1D_2=[00]$  is zero. For  $D_1D_2=[01]$  the photocurrent is given by:

$$i_{01}(t) = I_{PD}(1 - e^{-\omega(t - T_b)}), \quad t \in [T_b, 2T_b] \tag{5.12}$$

where  $I_{PD}$  is the PD peak current. The expression for the photocurrent for the case of  $D_1D_2 = [10]$  is:

$$i_{10}(t) = \begin{cases} I_{PD}(1 - e^{-\omega t}), & t \in [0, T_b] \\ I_{PD}(1 - e^{-\omega T_b})e^{-\omega(t - T_b)}, & t \in [T_b, 2T_b] \end{cases} \tag{5.13}$$

Finally, for the case of  $D_1D_2 = [11]$ , the photocurrent is given by:

$$i_{11}(t) = I_{PD}(1 - e^{-\omega t}), \quad t \in [0, 2T_b] \tag{5.14}$$

Assuming an infinite extinction ratio, the average optical power,  $P_{avg}$ , is related to the peak current through the responsivity,  $R_{PD}$ , as:

$$I_{PD} = 2R_{PD}P_{avg} \tag{5.15}$$

At  $t = 2T_b$ ,  $\Delta v_{00} = 0$  for  $D_1D_2 = [00]$ . For  $D_1D_2 = [01]$ ,  $\Delta v_{01}$  is given by:

$$\Delta v_{01} = \frac{2R_{PD}P_{avg}}{C_{IN}} \left( T_b - \frac{1}{\omega} (1 - e^{-\omega T_b}) \right) \tag{5.16}$$

For  $D_1D_2 = [10]$ ,  $\Delta v_{10}$  is given by:

$$\Delta v_{10} = \frac{2R_{PD}P_{avg}}{C_{IN}} \left( T_b - \frac{1}{\omega} (1 - e^{-\omega T_b}) \times e^{-\omega T_b} \right) \tag{5.17}$$

This equation takes into account the exponential decay of the current when transitioning from one to zero during the second bit period.

Finally, for  $D_1D_2 = [11]$ ,  $\Delta v_{11}$  is:

$$\Delta v_{11} = \frac{2R_{PD}P_{avg}}{C_{IN}} \left( 2T_b - \frac{1}{\omega} (1 - e^{-2\omega T_b}) \right) \tag{5.18}$$

From (5.15), a photodetector with high responsivity and small junction capacitance is desirable for optimal sensitivity. Moreover, the current is integrated over a full unit interval  $(T_b)$  in (5.16) or two full unit intervals  $(2T_b)$  in (5.17) and (5.18) as opposed to the  $0.5T_b$  and  $0.75T_b$  used in resettable receiver front-ends and current-amplifier-based receivers, respectively. It can also be shown that  $\Delta v_{01}$  is equal to  $\Delta v_{11} - \Delta v_{10}$  from the three  $\Delta v$ 



Fig. 5.13: The ratio of  $\Delta v_{01}/\Delta v_{0.75T_b}$  vs photodetector bandwidth (in terms of bit duration) in the case of the proposed receiver over that of the current-amplifier-based receiver.

equations (5.16), (5.17), and (5.18).
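The closed-form results (5.16) to (5.18), and the identity $\Delta v_{01} = \Delta v_{11} - \Delta v_{10}$, can be checked numerically by integrating the single-pole photocurrents (5.12) to (5.14). The Python sketch below is only a sanity check under stated assumptions: quantities are normalized ($I_{PD} = 1$, $C_{IN} = 1$, $T_b = 1$), the PD bandwidth is set to $0.35/T_b$, and the function names are hypothetical.

```python
import math

def photocurrent(d1, d2, t, w, tb):
    """Normalized i_PD(t)/I_PD for bits (d1, d2) with a single-pole PD."""
    if t <= tb:
        return d1 * (1 - math.exp(-w * t))
    start = d1 * (1 - math.exp(-w * tb))   # current reached at t = tb
    # Exponential settling from 'start' toward the level of the second bit.
    return d2 + (start - d2) * math.exp(-w * (t - tb))

def integrate(d1, d2, w, tb, n=20000):
    """Midpoint-rule integral of the photocurrent over [0, 2*tb]."""
    h = 2 * tb / n
    return h * sum(photocurrent(d1, d2, (k + 0.5) * h, w, tb) for k in range(n))

def closed_form(w, tb):
    """Bracketed terms of (5.16), (5.17), and (5.18)."""
    e = math.exp(-w * tb)
    dv01 = tb - (1 - e) / w
    dv10 = tb - (1 - e) * e / w
    dv11 = 2 * tb - (1 - e * e) / w
    return dv01, dv10, dv11

tb = 1.0
w = 2 * math.pi * 0.35 / tb
dv01, dv10, dv11 = closed_form(w, tb)
num01, num10, num11 = (integrate(a, b, w, tb) for a, b in ((0, 1), (1, 0), (1, 1)))
```

With these normalizations the numerical integrals agree with the closed-form expressions, and `dv11 - dv10` equals `dv01` to within floating-point error.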

For the cases where the integration period is  $0.75T_b$ ,  $\Delta v$  is:

$$\Delta v_{0.75T_b} = \frac{2R_{PD}P_{avg}}{C_{IN}} \left( 0.75T_b - \frac{1}{\omega} (1 - e^{-0.75\omega T_b}) \right) \tag{5.19}$$

For a quantitative assessment of the improvement in  $\Delta v$ , the ratio of the  $\Delta v$  of the proposed receiver ( $\Delta v_{01}$  or, equivalently,  $\Delta v_{11} - \Delta v_{10}$ ) to  $\Delta v_{0.75T_b}$  is plotted versus the PD bandwidth (normalized to the data rate) in Fig. 5.13. This ratio is given by:

$$\frac{\Delta v_{01}}{\Delta v_{0.75T_b}} = \frac{\omega T_b - (1 - e^{-\omega T_b})}{0.75\omega T_b - (1 - e^{-0.75\omega T_b})} \tag{5.20}$$

This analysis is verified through simulations using the single-pole photodetector model

shown in the inset of Fig. 5.9. This model is used to compute the simulation points presented in Fig. 5.13. Fig. 5.13 indicates that there is an improvement factor of 1.55 at a frequency  $(\omega/2\pi)$   $f=0.35/T_b$  (half the conventional bandwidth) corresponding to an improvement of 3.8 dB in receiver sensitivity. As expected, with higher photodetector bandwidth, the improvement factor decreases until it reaches the final value of 1.33 corresponding to the ratio of the integration periods (1 UI / 0.75 UI). As lower bandwidth PDs tend to be more cost-effective, the proposed receiver shows an improvement in sensitivity with those PDs as indicated by Fig. 5.13. Moreover, there is a factor of two improvement over the DOM integrating front-end receiver corresponding to a 3-dB optical sensitivity improvement as indicated by (5.4), excluding splitting and delay line losses. This is because the front-end always resets before integrating.
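The improvement factor can be reproduced from the ratio of (5.16) to (5.19). The Python sketch below evaluates that ratio as a function of the normalized PD bandwidth $f T_b$; the function name is hypothetical, and the conversion to decibels uses $20\log_{10}$, which is an assumption made here because it reproduces the quoted 3.8 dB.

```python
import math

def improvement_ratio(f_norm):
    """Ratio dv_01 / dv_0.75Tb with x = omega*T_b = 2*pi*f_norm."""
    x = 2 * math.pi * f_norm
    numerator = x - (1 - math.exp(-x))
    denominator = 0.75 * x - (1 - math.exp(-0.75 * x))
    return numerator / denominator

ratio_035 = improvement_ratio(0.35)          # PD bandwidth of 0.35/T_b
ratio_035_db = 20 * math.log10(ratio_035)
ratio_fast_pd = improvement_ratio(100.0)     # very fast PD, approaches 1/0.75
```

Evaluated at $f = 0.35/T_b$, the ratio is close to 1.55, and for a very fast PD it approaches 1.33, the ratio of the two integration periods.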

## 5.3.3 Noise in the two-bit integrating front-end receiver and input capacitance impact on SNR

The SNR at the input, taking into account the noise of the two voltage gain stages (variances  $\sigma_{A1}^2$  and  $\sigma_{A2}^2$ ) as well as of the comparator ( $\sigma_C^2$ ), which were ignored in (5.6), is given by:

$$SNR = \left(\frac{\Delta v_{01}}{\sqrt{\frac{kT}{C_{IN}} + \sigma_{A1}^2 + \sigma_{A2}^2 + \sigma_C^2}}\right)^2 \tag{5.21}$$

In the proposed approach, the gain of the first two gain stages is increased at the expense of bandwidth. The main noise contributions come from the two voltage gain stages and the  $kT/C$  noise of the input capacitance. The noise of the comparator is attenuated by the gain of the two voltage gain stages when referred to the input, as indicated by the Friis formula for noise, and can thus be ignored. Consequently, (5.21) can be approximated by:

$$SNR \approx \left(\frac{\Delta v_{01}}{\sqrt{\frac{kT}{C_{IN}} + \sigma_{A1}^2 + \sigma_{A2}^2}}\right)^2 \tag{5.22}$$

The square of the denominator,  $kT/C_{IN} + \sigma_{A1}^2 + \sigma_{A2}^2$ , was simulated within the bandwidth of the receiver for different values of  $C_{IN}$ . Moreover,  $\Delta v_{01}$  was simulated with a peak photocurrent of 100  $\mu A$  at 20 Gb/s. The simulated SNR is plotted in Fig. 5.14. As expected, smaller junction and parasitic capacitances result in a better SNR, which enhances sensitivity. The parasitic capacitance at the input is reduced by removing the intermediate metal layers in the bond pads and reducing their size. Indeed, packaging can have a significant impact on sensitivity. Wire bonding is the most common optoelectronic packaging technique and it is used here. Flip-chip bonding of the electronic receiver onto the photonic chip with thin copper pillars can be used to significantly improve the sensitivity [48,49].
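To give a feel for the capacitance dependence, the following Python sketch evaluates (5.22) with the amplifier noise terms set to zero, which yields only an optimistic, $kT/C$-limited bound and not the simulated curve of Fig. 5.14. The temperature of 300 K, the ideal (very fast) photodetector, and the function names are assumptions of this sketch; the 100 µA peak photocurrent, the 20 Gb/s data rate, and the 160 fF and 33 fF capacitances are the values quoted in this chapter.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K

def ktc_noise_vrms(c_in, temp_k=300.0):
    """RMS kT/C noise voltage on the integration capacitance (assumed 300 K)."""
    return math.sqrt(K_B * temp_k / c_in)

def ideal_dv01(i_peak, bit_rate, c_in):
    """Delta v for one integrated '1' bit, assuming an ideal (very fast) PD."""
    return i_peak / (bit_rate * c_in)

def ktc_limited_snr(i_peak, bit_rate, c_in):
    """(5.22) with the amplifier terms set to zero: an optimistic bound."""
    return (ideal_dv01(i_peak, bit_rate, c_in) / ktc_noise_vrms(c_in)) ** 2

snr_160 = ktc_limited_snr(100e-6, 20e9, 160e-15)   # wire-bonded case
snr_33 = ktc_limited_snr(100e-6, 20e9, 33e-15)     # flip-chip case of [48,49]
snr_gain = snr_33 / snr_160
```

In this simplified bound the signal scales as $1/C_{IN}$ while the noise voltage scales as $1/\sqrt{C_{IN}}$, so the SNR scales as $1/C_{IN}$.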



Fig. 5.14: Simulated SNR versus  $C_{IN}$  showing improvement with a smaller capacitance.

## 5.3.4 Detailed circuit implementation

A detailed circuit implementation of one of the sub receivers is shown in Fig. 5.15. An NMOS switch is used to discharge the input capacitance at the end of the integration cycle. A shorted PMOS transistor is used to cancel the charge injection of the NMOS transistor. This is done to avoid residual charge in the integration capacitor. The size of the PMOS is half the size of the NMOS. The size of the NMOS switch is minimized to reduce its contribution to the input capacitance while being kept large enough to reliably discharge the input parasitic capacitance. The NMOS and the PMOS switches are clocked by the two complementary clock phases. The two-bit integrated voltage is then amplified by two inductively-peaked cascode voltage gain stages. Inductive peaking increases the gain-bandwidth product of the receiver. This enables the receiver to provide more gain for a given bandwidth than when inductive peaking is not used, and thus improves the sensitivity of the receiver. The two stages are AC coupled to allow for optimal biasing. The low cut-off frequency of the AC coupling network is designed carefully to allow for PRBS 7 and PRBS 15 measurements. The value of the coupling capacitor is kept small (80 fF) to reduce capacitive parasitic loading at the output of the gain stages. To compensate for the small value of the coupling capacitor  $C_C$ , the value of R is increased, and a low cut-off frequency of 750 kHz is maintained. It can be shown that a large value of R and a small value of  $C_C$  result in a negligible contribution to the input-referred noise and do not affect the sensitivity of the receiver [5]. The low cut-off frequency ensures that the bit patterns [01] and [10] completely overlap as illustrated in Fig. 5.11, even at data rates as low as 5 Gb/s, assuming that the PD bandwidth is high enough. Additionally, AC coupling prevents low-frequency supply noise injected at the output node of the first amplifier stage from being coupled into the second stage. Moreover, any in-band noise injected at the output of the second stage is divided by the gain of the two stages when referred to the input.

Fig. 5.16 shows the simulation results for the small-signal voltage gain of the two gain stages. The gain stages have a bandwidth of 12 GHz with a peak gain of 11 dB. The bandwidth is overdesigned to be 0.4 of 30 Gb/s. In practice, the receiver is limited to 22 Gb/s because of the limited switching speed of the CML latches due to the technology node. The bandwidth can be relaxed to 7.7 GHz (0.35×22 Gb/s) without impacting the



Fig. 5.15: Detailed circuit implementation of one of the four sub receivers in the proposed receiver. The input is wire bonded to a photodetector.

functionality of the receiver. An important design consideration is the linearity of the gain stages as the receiver needs to process multilevel signals. Fig. 5.17 shows the simulated signal power at the output of the two amplifier stages versus the power at the input. The input-referred 1-dB compression point is at -13.4 dBm. Moreover, the input-referred third-order intercept point (IIP3) is simulated using the two-tone test and is -4.15 dBm as shown in Fig. 5.18. The IIP3 corresponds to a peak voltage of 200 mV (138 m $V_{rms}$ ) at the input. The calculated optical power required to generate this voltage is -3.4 dBm assuming a PD responsivity of 0.7 A/W, a total input capacitance of 160 fF, and that a pair of '1's is received at 20 Gb/s. Considering that this optical power is relatively high, the receiver is considered to have good linearity, especially since it needs to process only two integrated bits. With a simulated input-referred voltage noise of 0.9 m $V_{rms}$  and an IIP3 of 138 m $V_{rms}$ , the spurious-free dynamic range is calculated to be 29 dB using (5.23).

$$SFDR = \frac{2}{3} \times [IIP3\ (\mathrm{dBm}) - \text{Noise Power}\ (\mathrm{dBm})] = 29\ \mathrm{dB} \tag{5.23}$$
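The arithmetic behind (5.23) can be retraced with a short Python sketch. It assumes a 50 $\Omega$ reference impedance for converting between rms voltage and dBm, which is not stated explicitly in the text but reproduces the quoted 138 m$V_{rms}$ for an IIP3 of -4.15 dBm; the function names are hypothetical.

```python
import math

Z0 = 50.0  # ohm, assumed reference impedance for the dBm conversions

def vrms_to_dbm(v_rms, z0=Z0):
    """Power in dBm of an rms voltage across z0."""
    return 10 * math.log10(v_rms ** 2 / z0 / 1e-3)

def dbm_to_vrms(p_dbm, z0=Z0):
    """RMS voltage across z0 for a power given in dBm."""
    return math.sqrt(10 ** (p_dbm / 10) * 1e-3 * z0)

iip3_dbm = -4.15                       # simulated IIP3
noise_dbm = vrms_to_dbm(0.9e-3)        # 0.9 mVrms input-referred noise
sfdr_db = (2 / 3) * (iip3_dbm - noise_dbm)
iip3_vrms = dbm_to_vrms(iip3_dbm)
```

With these assumptions the noise power is close to -48 dBm and the resulting SFDR is approximately 29 dB.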

The amplified voltage is fed to two CML flip-flops. Each flip-flop consists of a master CML latch followed by a slave CML latch. A CML topology minimizes kickback noise in comparison to CMOS latches. The two voltage gain stages further reduce residual kickback noise from the latches. The CML latches used here are clocked with quarter-rate clocks allowing more time for the latches to fully regenerate. In this prototype, each of the two CML flip-flops is fed with two externally applied reference voltages,  $V_{ref1}$  and  $V_{ref2}$ , for



Fig. 5.16: AC simulation gain of the amplifier stages.



**Fig. 5.17:** The output power of the two amplifier stages versus the input power. The input-referred 1-dB compression point is -13.4 dBm.



**Fig. 5.18:** Two-tone test showing the fundamental and the third-order harmonic powers. The IIP3 is at -4.15 dBm.

comparison with the signal. This allows the tuning of the comparators in each of the sub receivers to account for process variations. Two differential pairs are used at the outputs of the slave CML latch to further boost the output voltages and interface with two digital D flip-flops.

The differential pairs operate at a quarter data rate and are designed to have a high gain at this low speed, consuming less power. One output of each of the two differential pairs is connected to a D flip-flop. The two outputs of the D flip-flops are connected to an XOR gate for decoding according to (5.8). Finally, the output of the XOR gate is connected to a buffer to drive the measurement equipment.
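The decision and decoding chain can be modeled behaviorally as two comparators with different references followed by an XOR gate. The Python sketch below is an idealized illustration only: the integrated symbol is normalized to the levels 0, 1, and 2 (ideal PD, no noise), and the reference values 0.5 and 1.5 are hypothetical stand-ins for the externally applied $V_{ref1}$ and $V_{ref2}$.

```python
def slice_and_decode(d_prev, d_curr, v_ref1=0.5, v_ref2=1.5):
    """Recover B_k from the integrated two-bit symbol, following (5.8)."""
    level = d_prev + d_curr      # normalized integrated symbol: 0, 1, or 2
    c1 = int(level > v_ref1)     # comparator with the lower reference
    c2 = int(level > v_ref2)     # comparator with the upper reference
    return c1 ^ c2               # XOR is 1 only for the middle level

truth_table = {(a, b): slice_and_decode(a, b) for a in (0, 1) for b in (0, 1)}
```

The XOR output is high only when the symbol sits at the middle level, that is, when the two integrated bits differ, which matches $B_k = D_k \oplus D_{k-1}$.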

## 5.4 Experimental results

The receiver is implemented in a 65 nm CMOS technology. Fig. 5.19 shows a micrograph of the receiver along with the wire-bonded  $1\times4$  photodetector array.



**Fig. 5.19:** Micrograph of the fabricated chip occupying 1.5 mm  $\times$  1.5 mm and wire-bonded to a 1  $\times$  4 PD array with a 250  $\mu m$  pitch.

The receiver is measured in two steps: 1) a single sub receiver measurement illustrated in Fig. 5.20, and 2) a full system measurement shown in Fig. 5.21. Continuous-wave (CW) light from a 1550 nm laser passes through a polarization controller (PC) and is then modulated using a Mach–Zehnder modulator (MZM) at 10 Gb/s, 16 Gb/s, and 22 Gb/s with a PRBS 7 or PRBS 15 sequence from a pulse pattern generator (PPG). The output power of the modulator is controlled using a variable optical attenuator (VOA). A mechanically tunable optical delay line (ODL-330 by Santec) is used to align the system clock and the data. This delay line has a delay tuning range of 400 ps and a resolution of 0.2 ps. Thus, it was possible to set a delay of one unit interval accurately in these measurements. The delay line is followed by a 90:10 power splitter where 10 % of the output is connected to a power meter (PM) for monitoring while 90 % of the signal is connected to one of the photodetectors in the 1  $\times$  4 PD array (DO309\_20um\_C3\_1x4 by Global Communication Semiconductors, LLC). A bit error rate (BER) measurement is done by changing the optical power applied to the chip through the VOA and recording the BER for each input power. The eye diagram is recorded with a digital communication analyzer (DCA). The measured BER curves are shown in Fig. 5.22 and Fig. 5.23 for PRBS 7 and PRBS 15 inputs, respectively. The electronic receiver achieves an average sensitivity of -7.8 dBm at 22 Gb/s with a PRBS 7 sequence and -6.7 dBm with a PRBS 15 sequence for a BER less than  $10^{-12}$ . The extinction ratio is measured to be 8 dB, and thus, the corresponding optical modulation amplitude (OMA) is calculated to be -6.2 dBm for a PRBS 7 sequence. The measured quarter-rate output eye diagram at 5.5 Gb/s is shown in Fig. 5.24.
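The conversion from average-power sensitivity to OMA can be sketched in Python using the standard relation $OMA = 2P_{avg}(ER - 1)/(ER + 1)$, where $ER$ is the linear extinction ratio. The function name is hypothetical; the inputs are the PRBS 7 sensitivity and the extinction ratio reported above.

```python
import math

def oma_dbm(p_avg_dbm, er_db):
    """OMA in dBm from average power (dBm) and extinction ratio (dB)."""
    er = 10 ** (er_db / 10)
    # OMA = P1 - P0 = 2 * P_avg * (ER - 1) / (ER + 1)
    return p_avg_dbm + 10 * math.log10(2 * (er - 1) / (er + 1))

oma_prbs7 = oma_dbm(-7.8, 8.0)
```

For -7.8 dBm and an 8 dB extinction ratio this evaluates to roughly -6.2 dBm OMA, the value listed for this work in Tables 5.1 and 5.2.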

To validate the tolerance of the receiver to timing variations in the optical delay lines, the bathtub curve is measured at 22 Gb/s as shown in Fig. 5.25. The receiver shows a timing



Fig. 5.20: Single sub receiver measurements setup.

error tolerance of approximately 0.1 UI (i.e., 4.5 ps) at 22 Gb/s. Note that it is possible to reliably design integrated optical delay lines with a delay error of less than 3 ps, as outlined in section 5.6.

The proper operation of the complete system is confirmed by verifying correct deserialization and by measuring crosstalk levels and power consumption. In the setup shown in Fig. 5.21, the modulated light of the MZM is amplified using an erbium-doped fiber amplifier (EDFA). The output of the EDFA is then filtered using a bandpass optical filter centered around 1550 nm followed by a VOA. The output of the VOA is connected to a 10:90 coupler for monitoring, after which the 90 % output is sent to a 1:4 optical splitter with a 6.5 dB insertion loss. Each of the four outputs of the splitter is connected to a



Fig. 5.21: Full system measurement setup.

mechanically tunable optical delay line (ODL-330 by Santec) with a reported insertion loss of 1.5 dB. The delays are adjusted to  $T_b$  according to the scheme shown in Fig. 5.9. In a final implementation, these delay lines would be replaced with silicon-photonic delay lines as described in section 5.6. A fiber array couples the light to the 1×4 photodetector array. The BER is measured at 22 Gb/s with a PRBS 7 input and shown in Fig. 5.26 in comparison with single-channel measurements. There is a degradation of 1.3 dB due to crosstalk between the PDs. To mitigate this, the on-chip spacing between the PDs could be increased, or ground bond wires acting as shields could be placed between the PDs. The speed of the receiver is limited to 22 Gb/s by the switching speed of the CML latches as opposed to the front-end. With implementation in a more advanced technology node or in a monolithic process, the operating speed is expected to improve.

The circuit dissipates 87.6 mW from a 1.09 V power supply. 31.6 mW or 36 % of the



Fig. 5.22: BER measurements for PRBS 7 input.



Fig. 5.23: BER measurements for PRBS 15 input.



Fig. 5.24: 5.5 Gb/s output quarter-rate eye diagram.



Fig. 5.25: Bathtub measurements at 22 Gb/s.



**Fig. 5.26:** BER curve comparing single channel operation with full system operation and crosstalk penalty at 22 Gb/s and with a PRBS 7 sequence.

power is consumed by the core of the receiver and both clock phase buffers. 56 mW (64 %) is dissipated by the output buffer, required to drive the 50  $\Omega$  terminated measurement equipment. The resulting energy-efficiency excluding the output buffer is 1.43 pJ/bit.
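The power breakdown and the energy efficiency follow directly from the measured values. The short Python sketch below only retraces that arithmetic; the function name is hypothetical.

```python
def energy_per_bit_pj(power_mw, rate_gbps):
    """Energy per bit in pJ: mW divided by Gb/s gives pJ/bit."""
    return power_mw / rate_gbps

total_mw, core_mw, buffer_mw = 87.6, 31.6, 56.0
core_fraction = core_mw / total_mw
buffer_fraction = buffer_mw / total_mw
core_pj_per_bit = energy_per_bit_pj(core_mw, 22.0)
```

The core and clock buffers account for about 36 % of the total power, and dividing their 31.6 mW by 22 Gb/s gives approximately 1.43 pJ/bit.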

## 5.5 Discussion

The proposed technique successfully eliminates charge sharing and CID issues associated with integrating-type receivers and the need for short reset pulses present in current amplifier-based receivers. It also allows for an integration period of more than 1 UI as opposed to 0.75 UI in resettable receivers. It also uses only two clock phases to perform a demux-by-four as opposed to four required in other architectures. There are, however, some tradeoffs

present in the proposed technique. The first tradeoff is the additional system-level optical insertion loss. In this initial demonstration, the excess optical loss is 8 dB, with 6.5 dB associated with splitting of the optical signal and 1.5 dB from the optical delay lines. It is possible to reduce these losses by implementing the splitter and the delay lines using SiP technology. As explained in section 5.6, the delay line loss can be as low as 0.07 dB and the optical couplers can be designed to balance the power at the PDs. Thus, the total optical loss could be reduced to 6 dB.

A full bandwidth system utilizes twice the bandwidth required by the proposed system, and thus has twice the integrated noise. The sensitivity of the proposed system is, theoretically, 3 dB below a full-bandwidth system operating at the same data rate due to the excess insertion loss of the delay lines. Moreover, as indicated by (5.20), the front-end boosts the sensitivity of the electronic part of the receiver by 3.8 dB when the bandwidth is 0.35×data-rate in comparison to a current amplifier-based receiver and by 3 dB in comparison to the integrating front-end receiver. As a result, the sensitivity of the proposed receiver is only 2.2 dB and 3 dB below these systems, respectively, taking into account the 6 dB optical losses. To compensate for this degradation in sensitivity, advanced forward error correction codes can be used [37,57]. Alternatively, as indicated by (5.22), reducing the junction capacitance of the photodetector and the parasitic capacitance at the input can have a significant impact on the sensitivity. It is estimated that the front-end here has a total input capacitance (junction + parasitic) of 160 fF. Flip-chip packaging

with thin copper pillars [48,49] provides a total input capacitance of 33 fF and can be used for better sensitivity. Finally, if the receiver is implemented in a monolithic process, the capacitance associated with the bond pad is removed, improving the SNR and signal power as indicated by Fig. 5.14 which mitigates the sensitivity trade-off. The improvement in signal power in this case means that the voltage amplifiers can relax their gain leading to improved power consumption and better energy efficiency. To summarize this tradeoff, the receiver offers a reduced complexity by removing the clock generation circuits, which also leads to reduced power consumption, at the expense of degraded sensitivity.
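The loss budget behind this trade-off can be retraced with a few lines of Python. The sketch assumes an ideal, lossless 1:4 split for the SiP case, which is one way to arrive at the quoted figure of about 6 dB; the variable and function names are hypothetical, and all other numbers are taken from this section.

```python
import math

def net_penalty_db(optical_loss_db, front_end_gain_db):
    """Net sensitivity penalty after crediting the front-end improvement."""
    return optical_loss_db - front_end_gain_db

measured_excess_loss = 6.5 + 1.5           # splitter + delay line, dB
ideal_split_loss = 10 * math.log10(4)      # lossless 1:4 split (assumption)
sip_excess_loss = ideal_split_loss + 0.07  # plus SiP delay line loss

penalty_vs_ca = net_penalty_db(6.0, 3.8)           # vs current-amplifier-based
penalty_vs_integrating = net_penalty_db(6.0, 3.0)  # vs integrating front-end
```

With a 6 dB optical loss, the net penalties are 2.2 dB and 3 dB relative to the current-amplifier-based and integrating front-end receivers, respectively.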

A second trade-off is the fixed speed of the operation set by the optical delay of the delay lines. This can be mitigated by implementing electronically tunable delay lines in SiP [29].

A third trade-off is the additional area on the chip required for the bond pads needed to connect to the four photodetectors. This can be mitigated by implementing the receiver in a monolithic process such as the one offered by GLOBALFOUNDRIES [58] where bond pads are not needed, similarly to work presented in [50]. The reported area of PDs in SiP is  $25 \ \mu m \times 8 \ \mu m$  in [59], which is negligible in this case.

The proposed receiver, which is designed to be used as a source-synchronous receiver, can be adapted for use alongside a clock and data recovery (CDR) circuit such as the one proposed in [34]. The delay lines simplify the design of the oscillator in the CDR because only one clock phase needs to be recovered.

Tables 5.1 and 5.2 provide a performance comparison with the state-of-the-art. The

electronic front-end of the receiver achieves better sensitivity than the receivers in [35, 46], which need to maintain a large capacitance at the input to mitigate the issue of charge sharing. Moreover, the receiver in [46] uses 8B/10B encoding to bypass the CID issue as opposed to PRBS sequences. The sensitivity of the proposed receiver is worse than that of [48,49], which use advanced packaging techniques for better SNR and sensitivity.

The receiver in [47] achieves better energy efficiency and sensitivity, but includes a delay circuit that needs to be carefully designed and tuned across different process corners to achieve the required delay of 10 ps. [34] is the same receiver as in [47] but includes a CDR circuit that consumes more power and this reduces the energy efficiency and sensitivity. From this comparison, it can be seen that source-synchronous receivers offer better receiver energy efficiency, at the cost of the extra clock receiver circuit and clock connection.

The infinite impulse response (IIR) decision feedback equalizer (DFE) receiver in [36] employs a low-bandwidth TIA followed by an IIR DFE to compensate for the bandwidth reduction. IIR DFE receivers, however, are challenged by the critical timing requirements of the feedback loop, which needs to settle within 1 UI and could limit their use at higher speeds. Additionally, the IIR nature of the feedback could result in an error propagation issue in the case of an incorrect decision, especially if the magnitude of the feedback is increased, resulting in a burst of errors. This limitation also applies to finite impulse response (FIR) receivers with many taps. To address both the critical timing requirements and the error propagation challenges, the receiver in [54] uses a low-bandwidth TIA with a bandwidth of

**Table 5.1:** Performance summary and comparison (part 1)

|                                        | This work                     | [46]            | [35]                  | [48, 49]                           | [36]             | [47]              | [34]                    |
|----------------------------------------|-------------------------------|-----------------|-----------------------|------------------------------------|------------------|-------------------|-------------------------|
| CMOS node (nm)                         | 65                            | 90              | 65                    | 28                                 | 65               | 40                | 40                      |
| Data-rate (Gb/s)                       | 22                            | 16              | 24                    | 25                                 | 20               | 25                | 25                      |
| Sensitivity (dBm) at BER of $10^{-12}$ | $-7.8^{1}$; $-6.2^{1}$ OMA    | -5.4            | -4.7                  | -14.9                              | $-5.8^{1}$ OMA   | -10.8             | -8.7                    |
| Data type                              | PRBS 7, 15                    | 8B/10B          | PRBS 7, 9, 15         | PRBS 7, 9, 15                      | PRBS 7           | PRBS 15           | PRBS 31                 |
| Input capacitance (fF)                 | 160                           | 440             | 250                   | 33                                 | 200              | $100^{4}$         | $100^{4}$               |
| Power consumption (mW)                 | 31.6                          | 23              | 9.6                   | 4.25                               | 14.2             | 27.6              | 27.6                    |
| Energy efficiency (pJ/bit)             | 1.43                          | $1.44^{2}$      | $0.4^{3}$             | 0.17                               | 0.71             | 1.13              | 2.1                     |
| Area (mm<sup>2</sup>)                  | $4 \times 0.0812$             | 0.105           | 0.0028                | 0.0018                             | 0.027            | 0.007             | 0.09                    |
| Receiver type                          | Two-bit integrating front-end | Double sampling | Double sampling + DOM | Double sampling + DOM + low-BW TIA | Low-BW TIA + DFE | CA-based receiver | CA-based receiver + CDR |

 $<sup>\</sup>overline{\begin{tabular}{l}^{1} Optical losses not considered.} \\ {\begin{tabular}{l}^{2} Power consumption of front-end only.} \\ {\begin{tabular}{l}^{3} Without clock generation and SR latches.} \\ \end{tabular}$ 

<sup>&</sup>lt;sup>4</sup>PD capacitance reported only.

**Table 5.2:** Performance summary and comparison (part 2)

|                                        | This work                     | [54]                                      | [55]                                      | [53]               | [60]                                                | [5]                        |
|----------------------------------------|-------------------------------|-------------------------------------------|-------------------------------------------|--------------------|-----------------------------------------------------|----------------------------|
| CMOS node (nm)                         | 65                            | 14                                        | 14                                        | 28                 | 16                                                  | 65                         |
| Data-rate (Gb/s)                       | 22                            | 32                                        | 64                                        | 20                 | 50                                                  | 12.5                       |
| Sensitivity (dBm) at BER of $10^{-12}$ | $-7.8^{1}$; $-6.2^{1}$ OMA    | -11.7 OMA                                 | -5.5 OMA                                  | -8.6 OMA           | -10.9 OMA                                           | $-4^{1}$; $-3.4^{1}$ OMA   |
| Data type                              | PRBS 7, 15                    | PRBS 31                                   | PRBS 7                                    | PRBS 7             | PRBS 7                                              | PRBS 7                     |
| Input capacitance (fF)                 | 160                           | 69                                        | 69                                        | 200                | 90                                                  | 160                        |
| Power consumption (mW)                 | 31.6                          | 27.6                                      | 14                                        | 14                 | $97^{2}$                                            | 24.4                       |
| Energy efficiency (pJ/bit)             | 1.43                          | 1.4                                       | 1.4                                       | 0.7                | $1.94^{2}$                                          | $1.94^{2}$                 |
| Area (mm<sup>2</sup>)                  | $4 \times 0.0812$             | 0.046                                     | 0.028                                     | 0.005              | 0.27                                                | $4 \times 0.1185$          |
| Receiver type                          | Two-bit integrating front-end | Low-bandwidth TIA + 1-tap speculative DFE | Low-bandwidth TIA + 1-tap speculative DFE | Integrate-and-dump | Conventional with T-coils for bandwidth enhancement | receiver with passive      |

<sup>1</sup>Optical losses not considered. <sup>2</sup>Receiver + clock generation.

0.22×data-rate and a one-tap speculative DFE to compensate for bandwidth reduction to achieve 32 Gb/s. Speculative DFE relaxes the critical timing requirement to 4 UI, as opposed to 1 UI in a conventional DFE. The work reported in [55] is similar to [54] but is designed for 64 Gb/s; it consumes more power and has lower sensitivity due to the increased data rate while maintaining the same energy efficiency. The receiver in [55] was tested with PRBS 7, as opposed to PRBS 31 in [54]. By using a single speculative DFE tap, the error propagation issue of IIR DFE receivers is mitigated. However, the complexity of a speculative DFE increases exponentially with the number of taps. Both [54] and [55] are implemented in a 14 nm FinFET technology to achieve higher data rates. The critical timing requirement and increased complexity are avoided in the proposed receiver as the integrating nodes are reset to ground after each cycle. The energy efficiency of these receivers is similar to that of the proposed receiver despite the technology node gap.

The integrate-and-dump receiver in [53] removes the feedback used in [36] and replaces it with a reset operation, effectively addressing the critical timing requirement and the potential error propagation. However, it requires a wideband current amplifier in the front-end and four clock phases. Since the proposed system uses optical blocks to replace clock phase generation, further power saving in clock generation is possible at the cost of extra optical insertion loss. The receiver in [53] is implemented in 28 nm CMOS and achieves an energy efficiency of 0.7 pJ/bit at 20 Gb/s. The proposed receiver is implemented in 65 nm CMOS, yet achieves a higher speed of 22 Gb/s, which highlights the benefit of the proposed architecture, potentially due to the reset duration of two UI. The gap in energy efficiency could be attributed in part to the technology node difference.

The receiver in [60] is a conventional full-bandwidth receiver implemented in a 16 nm CMOS FinFET technology node and exploits T-coils to improve the bandwidth and achieve a superior speed of 50 Gb/s. The inductors occupy a large area on the IC chip, increasing the cost of the design. While the proposed receiver also employs peaking inductors to improve the gain-bandwidth product in the 65 nm technology node used, the low-bandwidth front-end lends itself well to an inductor-less implementation, potentially saving area and cost. Additionally, as this is not a low-bandwidth receiver, the energy efficiency of the receiver including clock generation is the lowest.

Finally, the receiver in [5] is a 12.5 Gb/s 1.93 pJ/bit conventional full bandwidth receiver with a sensitivity of -3.4 dBm OMA. This receiver uses a conventional common-gate input stage and optical delay lines to replace clock generation. The proposed receiver achieves an all-around better performance than [5] thanks to the two-bit low bandwidth integrating front-end.

Overall, the proposed receiver is a robust, low-complexity alternative that is capable of sustaining long-running identical digits while maintaining a relatively high voltage difference without introducing an open-loop delay for the reset pulse. Such delay is susceptible to process variations. Moreover, the proposed receiver has better sensitivity compared to other receivers in similar technology nodes but can also benefit from scaling and more advanced technology nodes where the smaller input capacitance enhances sensitivity.

## 5.6 A note on the silicon-photonic structures compatible with the two-bit integrating front-end receiver

This section describes a proposed SiP structure that integrates the functionality of the  $1\times4$  photonic splitter, the optical delay lines, and the photodetector array onto a single compact chip for integration with the receiver presented in this chapter. This architecture is based on the designs introduced in [18, 39].

The layout of the proposed structure is shown in Fig. 5.27. The proposed SiP chip consists of a single input grating coupler, used to couple the light to the chip, followed by a 50:50 splitter. Each of the two outputs of the splitter feeds a directional coupler with a coupling ratio of 49:51 to compensate for the propagation loss in the delay lines. Two of the outputs, labeled Out 1 and Out 4, are directly routed to two photodetectors, while the other two, labeled Out 2 and Out 3, are each routed through an optical delay line with a delay corresponding to the period of one bit  $(T_b)$ . The delay lines are made of low-loss silicon waveguides with a core cross-section of 220 nm  $\times$  3  $\mu$ m and have a length of 3.63 mm that provides a delay of approximately 45 ps, which corresponds to one bit at 22 Gb/s. The reported loss for the 220 nm  $\times$  3  $\mu$ m optical waveguide is 0.2 dB/cm and, therefore, the



**Fig. 5.27:** Layout of a proposed split-delay SiP structure including a grating coupler, three directional couplers acting as power splitters, two one-bit delay lines and four photodetectors.

and Out 3 is compensated by adjusting the coupling ratio of the two directional couplers to 49:51. Thus, the optical power reaching each of the four photodetectors is the same. The delay lines have a rectangular layout to minimize the area of the chip. In a final integrated system, each of the four detectors can be wire bonded (or flip-chipped) onto the receiver.
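The delay-line figures quoted above can be cross-checked with a short calculation. The sketch below is illustrative only: the group index `N_GROUP` is an assumed value (it is not stated in the text and is chosen so that 3.63 mm gives roughly one bit period), and the splitter is treated as ideal and lossless. The computed split ratio therefore accounts only for the straight-waveguide propagation loss and is not the design procedure behind the 49:51 ratio, which may include other loss contributions.

```python
C0 = 299_792_458.0  # speed of light in vacuum (m/s)

# Values quoted in the text
DATA_RATE = 22e9        # bit rate (b/s)
LENGTH_M = 3.63e-3      # delay-line length (m)
LOSS_DB_PER_CM = 0.2    # reported propagation loss (dB/cm)

# ASSUMPTION: group index of the wide waveguide (not given in the text)
N_GROUP = 3.75


def bit_period(data_rate):
    """Duration of one bit (s)."""
    return 1.0 / data_rate


def delay(length_m, n_group):
    """Group delay of a waveguide of the given length (s)."""
    return length_m * n_group / C0


def propagation_loss_db(length_m, loss_db_per_cm):
    """Propagation loss of the delay line (dB)."""
    return loss_db_per_cm * length_m * 100.0


def balancing_split(loss_db):
    """Power fraction an ideal lossless splitter must send to the delayed
    arm so that both arms deliver equal power after the delay-line loss."""
    transmission = 10 ** (-loss_db / 10.0)
    return 1.0 / (1.0 + transmission)


t_bit = bit_period(DATA_RATE)
t_delay = delay(LENGTH_M, N_GROUP)
loss_db = propagation_loss_db(LENGTH_M, LOSS_DB_PER_CM)
delayed_fraction = balancing_split(loss_db)

print(f"bit period      : {t_bit * 1e12:.2f} ps")
print(f"delay (assumed) : {t_delay * 1e12:.2f} ps")
print(f"line loss       : {loss_db:.4f} dB")
print(f"delayed-arm share for equal power: {delayed_fraction:.4f}")
```

Under these assumptions the propagation loss of a single delay line is below 0.1 dB, so only a small imbalance in the coupler is needed to equalize the power at the photodetectors.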

The proposed receiver can be flip-chip bonded to the structure proposed in Fig. 5.27 in a similar approach to the one proposed in [48]. The layout of the SiPh chip of [48] is shown in Fig. 5.28 for illustration purposes, where the IC chip can be placed on top of the SiPh chip and under bump metallizations (UBMs) connect the two chips. The parasitics expected are the parasitic capacitances of the PD and the UBMs. The PD capacitance is estimated to be 8 fF, and the expected parasitic UBM capacitance is 25 fF. This parasitic capacitance is lower than the 80 fF capacitance introduced by the bond pad when wire bonding is used. It is expected that the sensitivity will improve due to the reduction in the parasitic capacitance, as indicated by Fig. 5.14.

**Fig. 5.28:** SiPh chip with under bump metallizations (UBMs) used to connect to the IC chip in [48].

A monolithically integrated SiP with CMOS can also be considered [50, 61, 62]. Such SiP delay lines provide accurate and reliable delay with an error below 3 ps, and their size conveniently decreases at higher data rates of operation. This makes this approach less complex to implement [18]. It is also possible to replace the fixed delay lines in the proposed structure with electronically tunable lines to support various data rates [29]. The delay line in [29], shown in Fig. 5.29, consists of a ring-resonator delay element for fine delay tuning with a continuous delay of 23 ps and Mach–Zehnder interferometer (MZI) switches to select the delay path. There are eight MZI switches followed by seven binary delay stages. This delay line has a continuous delay of up to 1 ns. The insertion loss of the delay line ranges from 8.5 dB at 10 ps to 11.3 dB at 1 ns, compared to 0.1 dB for the fixed delay lines. A more reasonable approach to reduce insertion loss is to use only the ring resonator, which has an insertion loss of 1.1 dB when the delay is 10 ps. This will allow the receiver to dynamically operate from 22 Gb/s down to around 18.2 Gb/s with reasonable insertion losses.

**Fig. 5.29:** Electronically tunable delay lines consisting of ring-resonator and MZI delay elements [29].
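The data-rate range quoted above follows from the requirement that the total delay equals one bit period. A minimal sketch, assuming (for illustration only) that the ring resonator is placed in series with a fixed delay of about 45 ps and contributes between 0 and 10 ps of additional delay:

```python
def supported_data_rate_gbps(total_delay_ps):
    """Bit rate (Gb/s) whose bit period equals the given delay (ps)."""
    return 1e3 / total_delay_ps


FIXED_DELAY_PS = 45.0   # ASSUMPTION: fixed one-bit delay at about 22 Gb/s
RING_TUNING_PS = 10.0   # ring-resonator delay at 1.1 dB insertion loss

max_rate = supported_data_rate_gbps(FIXED_DELAY_PS)
min_rate = supported_data_rate_gbps(FIXED_DELAY_PS + RING_TUNING_PS)

print(f"data-rate range: {min_rate:.1f} to {max_rate:.1f} Gb/s")
```

With these assumed values, the lower bound comes out at about 18.2 Gb/s, in line with the figure given in the text.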

If electronically tunable delay lines are employed, then the insertion loss may vary for different data rates. It may then be necessary to use electronically tunable directional couplers, such as the two shown in Fig. 5.30, to adjust the coupling ratio such that the received power is the same at the outputs of both the no-delay and one-bit delay lines.



Fig. 5.30: Layout of a proposed split-delay SiP with directional couplers to ensure equal power at the output.

It should be noted that these implementations and integration with the receiver are suggested for future work.

## 5.7 Conclusion

This chapter presented a 22 Gb/s receiver with a sensitivity of -7.8 dBm (average optical power) and an energy efficiency of 1.43 pJ/bit. The receiver exploits photonic blocks to remove clock phase generation circuits for reduced power consumption. The receiver aims to address some of the issues present in integrating receivers and current-amplifier-based receivers, mainly charge sharing and short reset pulses, without introducing a TIA circuit, while avoiding the critical timing and complexity associated with DFE and speculative DFE based receivers.

The proposed receiver shows great potential at higher speeds of operation, where clocking becomes more demanding and requires duty-cycle and quadrature error detection circuits. Such circuits are not needed in this system. The receiver thus provides a compelling advantage in terms of robustness and reduced complexity. With technology scaling and more advanced technology nodes, low-bandwidth receivers such as the one proposed are desirable as they remove the bulky, power-hungry TIAs. Thus, the proposed receiver is suited for applications such as high-density data center interconnects.

This concludes the first theme of this thesis.

# Chapter 6

# Demonstration of Inter-chip Transmission with On-Chip Antennas in SiP

This chapter is the first of three chapters that cover the design of passives in silicon photonics. This chapter covers the design and the measurements of a 15 GHz monopole antenna in silicon photonics. Moreover, wireless inter-chip data transmission is demonstrated experimentally. The work presented in this chapter has been published as a journal paper in the IEEE Photonics Technology Letters [8].

## 6.1 Introduction

Recently, there has been an increasing interest in the implementation of front-end electronics with optical passives in silicon photonics (SiP) [5,9,18,59,63]. There are several examples of such integration. A low-pass RC filter at the output of a photodetector in SiP can potentially be used as an envelope detector [9]. An RC matching network matches the output impedance of a photodetector to a 50  $\Omega$  load for an optimum interface to test and measurement equipment or an antenna [63]. An optimized on-chip peaking inductor extends the bandwidth and responsivity of photodetectors [59]. Finally, an optical passive delay line replaces the clock generation blocks of a conventional electronic receiver [5, 18]. This trend is possible partly because the SiP platform is proving to be a cost-effective option to host bulky electronic passives compared to high-end complementary metal-oxide-semiconductor (CMOS) platforms. More importantly for RF operation, the low conductivity of the substrate in the SiP stack allows for the implementation of electronic passives with high quality factors. This low substrate conductivity is favorable for antennas, as it prevents electromagnetic waves from being dissipated as heat in the substrate, thus improving the overall efficiency of the system [21]. Moreover, radio-over-fiber (RoF) applications are gaining commercial interest [64]. As such, researchers are developing critical components needed to support these applications [65–69], including mode-locked laser diodes for RF operation [65], high output power photodetectors [65–67], photomixers [65, 68], photonic transmitters [69], photonic-antenna emitters [70, 71], and integrated photoreceivers [72].

Following these new opportunities enabled by the emergence and advancement of photonic integration, a planar monopole antenna for inter-chip RF communication and RoF applications is developed, leveraging a commercial SiP technology platform. The monopole antenna consists of one half of a dipole antenna with a ground plane acting as a mirror. SiP allows for monolithically integrating the antenna with the photodetector for a compact design with improved efficiency by virtue of the high substrate resistivity. Impedance matching between the photodetector and the antenna achieves optimal power transfer [73,74]. With the photodetector placed close to the antenna, the matching network can be removed, further motivating the design of an antenna in silicon photonics.

The eventual goal of the antenna design presented in this chapter is the integration of antenna-emitters in the SiPh technology stack. This will allow SiPh chips to communicate wirelessly with a microcontroller or a central processing unit. The feasibility of this approach was studied and published in [16]. In summary, it was found that integrated antenna-photodetector emitters are feasible as long as an optimized photodetector, such as the one used in [59], is used. This chapter focuses on the antenna part.

In this chapter, a 15 GHz monopole antenna is designed and measured. The antenna is fabricated in the commercially available silicon photonics process offered by Advanced Micro Foundry (AMF). Section 6.2 describes the structure of the designed antenna with the HFSS simulation results. Section 6.3 describes the S-parameter measurements validating the performance of the antenna. A pair of antennas on two different chips experimentally achieve inter-chip data communication. Section 6.4 discusses the results. Finally, section 6.5 concludes the chapter.

## 6.2 Antenna design

The layout of the designed antenna is shown in Fig. 6.1a. Here, a monopole antenna is chosen for its small size and its expected relatively uniform radiation pattern in the horizontal direction, making it suitable for inter-chip communication. The layer stack of the AMF fabrication process is shown in Fig. 6.1b and includes the two metal layers used to implement the antenna.



**Fig. 6.1:** (a) Layout of the fabricated antenna. (b) Stack layers of the AMF SiP fabrication process used to fabricate the antenna.

A quarter-wavelength monopole antenna is a single-ended fed antenna one-half the size of a dipole antenna. The ground plane, which acts as a mirror, creates a field above the ground that is identical to the one generated by a dipole antenna but with no field beneath it. The radiated field can be calculated by image theory, which states that the fields above a perfectly conducting plane from a primary source are found by summing the contributions of the primary source and its image [75]. This means that a monopole structure orthogonal to the ground plane behaves as a dipole antenna, but with a lower radiation impedance since it is single-ended. The body of the antenna is implemented on the top metal (M2), while the ground plane is implemented on both metal layers M1 and M2 connected through a via. The SiP process features a relatively thin undoped silicon substrate (120 µm) below the insulator (labelled Box in Fig. 6.1b). One advantage of the thin substrate is that post-processing, namely substrate etching, is not required to achieve good air radiation. Indeed, substrate etching is required when the process used features a thick substrate with high permittivity that leads to the electromagnetic waves radiating towards the silicon substrate instead of air [76]. Moreover, if the substrate is conductive, electromagnetic power is dissipated as heat, degrading the radiation efficiency of the antenna [21]. This highlights the technological advantages of SiP technology as an antenna-friendly, simple, and cost-effective process.

For a 15 GHz RF carrier in air, the length L of the planar monopole antenna is initially set to a quarter of the carrier wavelength ( $L = \lambda_o/4$ ). While the antenna length should be 5.800 mm in free space, simulations show that for the SiP process it needs to be 8.125 mm at 15 GHz. In theory, for optimal operation, the ground plane should extend to at least a quarter wavelength. However, the ground plane is shortened to 1.330 mm to reduce the size and cost of the fabricated antenna. Because the ground plane extends to less than the quarter wavelength necessary for proper operation, the length of the antenna is increased [77]. Simulations show that this smaller-than-ideal ground plane will reduce antenna gain by 0.368 (28 %).

To reduce the overall size of the design, the antenna is laid out in an "S" shape. The antenna will have a more uniform horizontal radiation pattern at the cost of reduced gain in the direction of maximum directivity (yz plane). The simulated gain of the antenna is shown in Fig. 6.2. Antenna gain defines how directional the antenna is in a given direction compared to an isotropic antenna, but also takes radiation losses into account. The gain has the expected donut shape. The peak gain is 0.95 (-0.22 dBi), approximately three times lower than that of an ideal monopole antenna. An ideal monopole built using the SiP stack shown in Fig. 6.1b would have a far-field peak gain of 2.776 (4.41 dBi) based on simulation. In this work, the antenna is used as a near-field coupler. While a 3D gain figure is provided, it is applicable in the far field only. Moreover, since the antenna is folded, there is a strong interaction between antenna elements. The field distribution should be studied and is suggested as future work.

To meet the design rules of the commercial SiP fabrication process, particularly as it relates to the permissible metal density and spacing, small holes of  $4 \mu m^2$  are uniformly distributed over the metal layers (Fig. 6.1a). The size of the holes, smaller than  $1/10^{th}$  of the antenna wavelength, will not affect the performance as the antenna can be considered a continuous surface.

**Fig. 6.2:** The simulated donut-shaped gain of the antenna perpendicular to the plane of the antenna. This gain is valid in the far-field region of operation.

The antenna is fed through a GSG probe for measurement purposes. In real applications, a photodetector replaces the GSG probe, where the proximity of the PD to the antenna mitigates the need for a transmission line and a matching network, which tend to be lossy. In the measurements described in the next section, the metal body of the probe impacts the frequency response and the radiation pattern of the antenna because it is close to the antenna. To account for any variation in frequency due to the body of the probe or process variations of the SiP process, symmetry is maintained in the measurement setup described next. The proposed antenna operates in the radiative near-field region. As such, conductors (e.g., GSG probes or other antennas) in the near field could lead to absorption of energy (signal leakage) and crosstalk. This will manifest itself as a change in the impedance of the antenna and a detuning of the operation frequency. Consequently, minimizing conductors in the near-field region or implementing impedance adjustment circuits, as in [78], may be required for optimum antenna impedance matching.

## 6.3 Experimental validation

The antennas are measured in two steps. First, the S-parameter measurements are taken using a vector network analyzer (VNA). Second, PRBS data transmission is demonstrated using a PD as a transmitter. The calculated length defining the boundary between near-field and far-field is 6.6 mm [79]. Thus, S-parameter measurements characterize the coupling between antennas in the near-field. Antenna gain, as reported in Fig. 6.2, is relevant for far-field distances beyond 6.6 mm.
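The 6.6 mm boundary is consistent with the common Fraunhofer far-field criterion $2D^2/\lambda$. The sketch below assumes that this criterion is the one used and that the largest antenna dimension $D$ equals the simulated antenna length of 8.125 mm; both are assumptions made for illustration, and the folded layout may call for a different value of $D$.

```python
C0 = 299_792_458.0  # speed of light in vacuum (m/s)


def fraunhofer_distance(d_max_m, freq_hz):
    """Far-field boundary 2*D^2/lambda (m) for largest dimension D (m)."""
    wavelength = C0 / freq_hz
    return 2.0 * d_max_m ** 2 / wavelength


# ASSUMPTION: D is taken as the simulated antenna length quoted in the text
boundary_m = fraunhofer_distance(8.125e-3, 15e9)

print(f"near-field/far-field boundary: {boundary_m * 1e3:.1f} mm")
```

Under these assumptions, the antenna separation of 0.3175 cm used in the S-parameter measurements lies well inside the near-field region.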

S-parameters are measured by landing two identical GSG probes on two antennas. Identical probes were chosen for these measurements so that any variation in the frequency response or the radiation pattern of the antenna pair due to the probes is matched. The antenna pair is placed at a distance of 0.3175 cm (0.125") from each other and aligned in the direction of maximum directivity in the y-axis as indicated by the gain pattern shown in Fig. 6.2. The antennas are arranged on a vacuum waveguide mount that uses air suction to hold them in place when landing the probes for measurements. Measured and simulated S11 (reflection coefficient) and S21 (forward transmission coefficient) curves are shown in Fig. 6.3a. Both curves show general agreement, with a shift in the resonance frequency of 0.1 GHz for S21 and 0.3 GHz for S11, likely due to the GSG probes used for the measurement. S-parameter simulations consider only the second antenna placed in a vacuum without considering the presence of other probes and circuits in the near field of the antenna during measurements. Further, the HFSS simulations were limited by memory and processing power, leading to inaccurate S21 prediction away from resonance. The measured 10 dB bandwidth of the antenna (S11 < -10 dB) is 600 MHz. The antenna shows a resonance at approximately 15 GHz and has an S21 peak of -16 dB. The transmission peak (S21) is measured for various distances as shown in Fig. 6.3b.



**Fig. 6.3:** (a) Measured and simulated (Sim) S-parameters with the antennas placed 0.3175 cm (0.125") apart. (b) Measured peak S21 at 15 GHz at three different distances.

In the second measurement step, the setup shown in Fig. 6.4 is used to demonstrate inter-chip data transmission. For simplicity, a homodyne transmission scheme is selected.

In this setup, a 1550 nm continuous wave (CW) is modulated with a Mach-Zehnder modulator (MZM). The input to the optical modulator is a 15 GHz RF carrier generated by a clock synthesizer (CLK 1, Anritsu 69377B) modulated with a 650 Mbps PRBS 7 data pattern generated by a programmable pattern generator (PPG, Keysight N4903B) using an RF mixer (Marki M90750) with 7.5 dB conversion loss. A relatively low bit rate is chosen as the antennas are narrow band. The carrier frequency is chosen to be 15 GHz as per the maximum S21 measured (Fig. 6.3a). The output of the optical modulator (MZM) is amplified using an erbium-doped fiber amplifier (EDFA). The output of the EDFA is filtered with an optical filter centered around 1550 nm with a 3 dB optical bandwidth of 0.69 nm. The output of the filter is connected to a variable optical attenuator (VOA) to control the optical power fed to an off-chip photodetector (Finisar XPDV2120R-VF-FA). The output of the VOA is fed to the PD through a 10/90 coupler. The 10 % output is monitored by an optical power meter. The photodetector has a reported typical responsivity of 0.65 A/W. The PD is internally matched to 50  $\Omega$ . This ensures proper matching between the PD and the antenna for maximum power transfer. On the receiver side, the output of the antenna is mixed with the same carrier frequency generated from CLK 2. The output of the mixer is amplified with an RF amplifier with 52 dB of gain and a bandwidth of 45 GHz. The signal is then filtered with a commercial low-pass filter with a cut-off frequency of 500 MHz. This is used to suppress frequency content at 30 GHz generated on the receiver side due to demodulation and any 15 GHz content due to clock feedthrough. Finally, the output is connected to an error detector (ED) and a sampling oscilloscope.



**Fig. 6.4:** Measurements setup for inter-chip communication with an external photodetector as a transmitter directly driving the antenna. GSG probes were used to drive the antennas as shown above the antenna pair symbol.

The measured bit-error-rate (BER) curves for 0.3175 cm (0.125") and 0.635 cm (0.25") are shown in Fig. 6.5. The antenna achieves a BER of  $10^{-10}$  at a distance of 0.3175 cm. The BER worsens with the distance between the two antennas, to approximately  $10^{-4}$  at 0.635 cm for the same input power. This relatively high BER is due to the weak driving capability of the PD, which is used directly as a transmitter without further pre-amplification to drive the antenna. In more conventional systems, a power amplifier is used to drive the antenna. Alternatively, high-power PDs could be used [65–67]. Another contributing factor is the smaller-than-ideal ground plane, which degrades the gain of the antenna. The captured and amplified eye diagram (Fig. 6.5) has an amplitude of 430 mV at 0.3175 cm and 270 mV at 0.635 cm, both for an input optical power of 10 dBm. At 1.27 cm (0.5"), the antenna is too far to allow for BER measurement. The ED fails to synchronize when the BER is above  $10^{-3}$ . The BER is limited to  $10^{-10}$  in the first case because the maximum power allowed at the PD is 10 dBm.



**Fig. 6.5:** Measured BER curves and eye diagrams of the inter-chip antenna link for two different antenna distances.

## 6.4 Discussion

Important considerations of PD-antenna systems in SiP include area, gain, data-rate, and optical power needed for transmission. In terms of area and gain, at higher frequencies of operation such as the mm-wave frequencies of 60 GHz and higher, the size of the antenna can be reduced, and antenna arrays become feasible. Such arrays can be used to achieve better gain for more efficient inter-chip communication. It is also possible to bend the antenna as done in this design to reduce area, but with an adverse impact on gain. The data rate is limited by the bandwidth of the antenna which can be increased by employing wider band structures, or higher frequency of operation. There are applications such as clock distribution [80] and wireless tagging [81] where low bandwidth is sufficient.

In terms of optical power, the minimum optical power required for RF transmission is estimated next. The transmitting antenna is driven directly with an RF generator while connecting the receiving antenna to a spectrum analyzer. At 0.3175 cm, the RF generator needs to produce -27 dBm RF power for the signal to be distinguishable from the noise floor of the spectrum analyzer on the receiving end. The RF power,  $P_{RF}$ , is related to the RMS current,  $I_{RMS}$  through the input impedance of the antenna, Z, by  $P_{RF} = (I_{RMS})^2 \times Z$ . This gives an RMS current of 200  $\mu$ A. The corresponding optical power,  $P_O$ , is then calculated using the responsivity of the PD  $(P_O = I_{RMS}/R)$ . If a photodetector with a responsivity of 0.75 A/W were to be used, a minimum optical power of -5.75 dBm would be needed. If a PD with lower responsivity were to be used, then higher optical power would be needed (e.g. -3 dBm for R = 0.4 A/W). Note that for practical data transmission, higher power would be needed as indicated by the BER measurements (Fig. 6.5). Circuits such as a low noise amplifier on the receiver side or a power amplifier on the transmitter side can extend the communication distance between the two antennas. One can consider an equalizer to increase the limited bandwidth of the antenna.
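The estimate above can be reproduced numerically. The sketch below follows the two relations given in the text, $P_{RF} = (I_{RMS})^2 \times Z$ and $P_O = I_{RMS}/R$. The antenna input impedance is assumed to be 50 Ω, which is not stated explicitly in this paragraph, and the function names are illustrative.

```python
import math


def dbm_to_watt(p_dbm):
    """Convert a power in dBm to watts."""
    return 1e-3 * 10 ** (p_dbm / 10.0)


def watt_to_dbm(p_watt):
    """Convert a power in watts to dBm."""
    return 10.0 * math.log10(p_watt / 1e-3)


def rms_current(p_rf_dbm, z_ohm):
    """RMS current (A) delivering the given RF power into impedance Z."""
    return math.sqrt(dbm_to_watt(p_rf_dbm) / z_ohm)


def min_optical_power_dbm(p_rf_dbm, z_ohm, responsivity_a_per_w):
    """Optical power (dBm) giving that RMS photocurrent, P_O = I_RMS / R."""
    p_opt_watt = rms_current(p_rf_dbm, z_ohm) / responsivity_a_per_w
    return watt_to_dbm(p_opt_watt)


P_RF_DBM = -27.0   # minimum RF power at 0.3175 cm (from the text)
Z_OHM = 50.0       # ASSUMPTION: antenna input impedance

i_rms = rms_current(P_RF_DBM, Z_OHM)
p_opt_075 = min_optical_power_dbm(P_RF_DBM, Z_OHM, 0.75)
p_opt_040 = min_optical_power_dbm(P_RF_DBM, Z_OHM, 0.40)

print(f"I_RMS             : {i_rms * 1e6:.0f} uA")
print(f"P_O (R = 0.75 A/W): {p_opt_075:.2f} dBm")
print(f"P_O (R = 0.40 A/W): {p_opt_040:.2f} dBm")
```

With a 50 Ω impedance, the sketch gives an RMS current of about 200 µA and optical powers close to the -5.75 dBm and -3 dBm values quoted above.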

## 6.5 Conclusion

In summary, this chapter presented a 15 GHz monopole antenna in a commercial SiP process. The antenna benefits from the high resistivity of the SiP substrate to achieve improved performance. S-parameters were measured, and inter-chip data transmission was demonstrated using a photodetector directly as a transmitter driving the antenna without RF amplification. Good matching (S11 < -10 dB) and a peak transmission of S21 = -16 dB were achieved at 15 GHz. The simulated gain is 0.95 (-0.22 dBi). This validation points towards the possibility of developing monolithic photonic emitters in SiP consisting of an antenna and a photodetector.

As the silicon photonic process is relatively cheap, and the potential monolithic integration of PD and antenna eliminates the need for matching networks, the proposed concept could prove to be a cost-effective, area effective solution for inter-chip and RoF communication applications.

An exciting research direction is using III-V technology to integrate the antenna, the photodetector, and potentially the modulator.

# Chapter 7

# Integrated RF Passive Low-Pass Filters in Silicon Photonics

This chapter continues the theme of passives in silicon photonics and presents an integrated low-pass RC filter driven by a photodetector. Design and measurement methodologies are detailed, and measurements are compared to simulated values. The work presented in this chapter has been published as a journal paper in the IEEE Photonics Technology Letters [9].

## 7.1 Introduction

Passive radio frequency (RF) elements such as inductors, capacitors, and resistors are necessary in high-speed optical transceivers. However, the bulky nature of passive RF components prevents further miniaturization of the RF chips for cost-effective high-speed applications. This limitation can be partially overcome by using off-chip components at the cost of a reduced level of integration and additional parasitics from the package. With the increased importance of RF passive elements in applications such as oscillators [82], passive equalizers [83], and data serializers / deserializers [18], it is worth considering alternative integration schemes with other optical/electrical components.

The monolithic integration of electronic and photonic components on silicon is the most advantageous solution to minimize parasitics between the optical and electronic circuits. Electronic circuits have been implemented in silicon photonic fabrication processes [4,84] but the transistors used have been limited to the 130 nm and 90 nm nodes. Another approach to achieve monolithic integration with higher performance electronic nodes has been to build optical devices with almost no modification to the electronic fabrication process [85]. However, the trade-off in this case is a reduction in the performance of the optical devices. Therefore, hybrid integration enables the use of state-of-the-art photonic integrated circuits (PICs) and complementary metal oxide semiconductor (CMOS) integrated circuits (ICs) in the same system [86]. Furthermore, the parasitics at the interface between the chips can be minimized with advanced system-in-a-package technologies, such as flip-chip bonding [87].

Silicon photonics provides a potentially cost-effective platform for the integration of photonic components with RF passive elements. This offers great opportunities to develop high-speed transceiver modules by co-packaging the PIC and the CMOS IC [88].

Furthermore, as processing speeds increase, one issue is the increased cost of integrating passive RF components on the CMOS chip as their size relative to the transistor circuit becomes significantly larger. This increase in cost of fabrication is especially noticeable for smaller CMOS technology nodes. For example, the cost per  $mm^2$  of STMicroelectronics 28 nm fully depleted silicon-on-insulator CMOS process is at least 17 times more expensive than that of Advanced Micro Foundry (AMF) Silicon Photonics technology (prices provided by CMC Microsystems) [20]. Moreover, since the typical minimum feature size in silicon photonics is above 100 nm, the fabrication can be done with less advanced photolithography tools than high-speed CMOS circuits. In addition to the cost advantage, silicon photonics benefits from a high-resistivity substrate. It has been shown that high-resistivity substrates facilitate the suppression of substrate noise and crosstalk and increase the quality factor (Q) of RF passive components [89–92].

This chapter presents the potential use of silicon photonics to implement integrated RF passive elements by demonstrating three variations of an RC low-pass filter (LPF) monolithically integrated with a photodiode (PD).

This proof-of-concept validates the potential for cost-effective receiver front-end designs. The model, the design strategy, and fabrication process are presented in the subsequent sections. The scattering parameters (S-parameters) of the electrical signal at the output of the photodiodes are measured and analyzed. Finally, an equivalent circuit is used to validate the behavior of the fabricated devices for parameter extraction.

### 7.2 Circuit model and design parameters

Fig. 7.1 presents a schematic diagram of a lumped model for the PD and the LPF. The PD, the filter resistor, and the filter capacitor are connected in a parallel configuration. In the circuit model, the LPF has a designed resistance $R_f$ and a designed capacitance $C_f$. The current source $I_{PD}$ models the photocurrent. At a reverse bias voltage of 2 V, the PD has a junction capacitance $C_j$ and a series resistance $R_s$, which are around 35.2 fF and 85 $\Omega$, respectively [59]. The metallic pads of the PIC add parasitics represented by the capacitance $C_{pad}$, which is about 15.2 fF for a pad size of $70 \times 70 \ \mu m^2$ [59]. From simulations, the estimated value of the pad resistance $R_{pad}$ is 4 $\Omega$, and the inductance $L_{pad}$ is 0.17 nH. The load resistor $R_L$ models the 50 $\Omega$ termination resistance of the measurement equipment.



**Fig. 7.1:** Schematic of the circuit model for the designed RC low pass filters. Dashed boxes outline the PD and LPF configurations.

The 3 dB bandwidth $(f_{3dB})$ of the receiver front-end is determined by the poles estimated using the open-circuit time constant approach. This method is an approximate analysis that estimates the cut-off frequency of an electronic circuit by summing the RC time constants associated with all the capacitors in the circuit [93]:

$$C' = C_f + C_{pad} \tag{7.1}$$

$$f_{3dB} = \frac{1}{2\pi(\tau_1 + \tau_2)} \tag{7.2}$$

$$\tau_1 = C'(R_f || (R_L + R_{pad})) \tag{7.3}$$

$$\tau_2 = C_j(R_s + R_f || (R_L + R_{pad})) \tag{7.4}$$

where $\tau_1$ and $\tau_2$ are the time constants associated with $C'$ and $C_j$, respectively. Since $C' \gg C_j$, $\tau_1 + \tau_2 \approx \tau_1$. As can be seen, the LPF parameters, $C_f$ and $R_f$, are the dominant parameters in determining the 3 dB bandwidth of the filter.

The designed silicon photonics receiver front-end is fabricated on an SOI wafer with a 220 nm silicon device layer, a 2 $\mu$m buried oxide layer, and a 725 $\mu$m silicon substrate with a resistivity greater than 750 $\Omega\cdot$cm. The SiGe vertical PDs have a thickness, width, and length of 0.5 $\mu$m, 8 $\mu$m, and 31 $\mu$m, respectively. The PDs consist of a highly doped n-type germanium layer, an intrinsic germanium layer, and a p-type silicon layer [94]. The dimensions of the plates for the metal-insulator-metal (MIM) capacitors were estimated by using the design strategy detailed in [95]. The capacitors consist of two aluminum layers, corresponding to Metal 1 and Metal 2 in the fabrication process, that are 0.75 $\mu$m and 2 $\mu$m thick, respectively. A 1.5 $\mu$m thick SiO<sub>2</sub> layer is sandwiched between the metal layers to form the capacitor dielectric. Fig. 7.2a shows the cross-section view of the MIM capacitor. The resistors are formed on a 0.09 $\mu$m thick n-type silicon layer with a doping density of at least $10^{20}$ $cm^{-3}$ [96]. Fig. 7.2b shows a micrograph of one of the implemented LPF structures integrated with a p-i-n PD.



**Fig. 7.2:** (a) Cross-section view of the MIM capacitor and layout view of the LPF structure (GC: grating coupler, PD: photodiode, M1: Metal 1, M2: Metal 2). (b) Micrograph of the RC filter structure in an active silicon photonics process

To achieve cut-off frequencies ranging from 1.5 to 2.7 GHz, the target values for the resistors are 160 $\Omega$ and 400 $\Omega$, and those for the capacitors are 1.35 pF and 2.34 pF. The estimated length and width of the metal plates for a 1 pF capacitor are 708 $\mu$m and 67 $\mu$m and for a 2 pF capacitor they are 708 $\mu$m and 33.5 $\mu$m. It has been shown that the fringing electric fields on the perimeter of the integrated capacitors lead to additional capacitance in the microstructures [97]. The fringing field capacitance can be estimated using the ANSYS HFSS 3D electromagnetic field simulator. Based on the HFSS simulation results, the expected capacitance values including the effect of the fringing field capacitance for the 1 pF and 2 pF MIM capacitors are 1.35 pF and 2.34 pF, respectively.

The LPF resistor values were designed using the relationship $R = \rho l/A$, where $\rho$ is the resistivity of the doped region, $l$ is the length of the resistor, and $A$ is the cross-sectional area of the resistor. Assuming a linear behavior of the integrated resistor at 300 K, a doping density of $10^{20}$ $cm^{-3}$, and a resistivity of $7.2\times10^{-4}$ $\Omega\cdot$cm [98], a 160 $\Omega$ resistance is obtained by choosing a length of 10 $\mu$m and a width of 5 $\mu$m for the doped region. Similarly, a length of 25 $\mu$m with the same width leads to a 400 $\Omega$ resistance.
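The sizing rule can be checked numerically. The short sketch below evaluates $R = \rho l/A$ for the two resistor lengths, taking the cross-section as the 5 $\mu$m width times the 0.09 $\mu$m thickness of the doped silicon layer quoted earlier. The function name and the unit handling are illustrative choices and are not part of the original design flow.

```python
def doped_resistor_ohms(rho_ohm_cm, length_um, width_um, thickness_um):
    """Resistance of a uniformly doped slab, R = rho * l / A."""
    rho_ohm_m = rho_ohm_cm * 1e-2                          # ohm*cm -> ohm*m
    area_m2 = (width_um * 1e-6) * (thickness_um * 1e-6)    # cross-section
    return rho_ohm_m * (length_um * 1e-6) / area_m2

RHO_OHM_CM = 7.2e-4    # resistivity quoted in the text
WIDTH_UM = 5.0         # width of the doped region
THICKNESS_UM = 0.09    # thickness of the n-type silicon layer

r_short = doped_resistor_ohms(RHO_OHM_CM, 10.0, WIDTH_UM, THICKNESS_UM)
r_long = doped_resistor_ohms(RHO_OHM_CM, 25.0, WIDTH_UM, THICKNESS_UM)
print(f"l = 10 um: {r_short:.0f} ohm, l = 25 um: {r_long:.0f} ohm")
```

Under these assumptions the two lengths evaluate to approximately 160 $\Omega$ and 400 $\Omega$, which matches the target values.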

Using the lumped model shown in Fig. 7.1, simulations with the Advanced Design System (ADS) software from Keysight were performed to fit the S-parameter curves obtained through the experimental measurements. Table 7.1 summarizes the parameters of the circuit model. As the junction capacitance $C_j$ is much smaller than the filter capacitance $C_f$, the parameters of the filter define the location of the dominant pole. The PD and pad parameters have negligible effects on the overall performance of the filter. LPF1, LPF2, and LPF3 are associated with the designed values of R1=400 $\Omega$ and C1=1.35 pF, R2=160 $\Omega$ and C2=2.34 pF, and R3=400 $\Omega$ and C3=2.34 pF, respectively. Using equations 7.1 to 7.4, the expected 3-dB cut-off frequencies are 2.65 GHz, 1.78 GHz, and 1.53 GHz, respectively.
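As a numerical illustration, the sketch below evaluates the open-circuit time constant estimate of equations 7.1 to 7.4 for the three designed filters, using the parasitic values quoted in Section 7.2. For comparison it also evaluates the filter pole alone, $1/(2\pi C_f (R_f || R_L))$, with the pad and PD parasitics dropped. That second expression is a simplification assumed here for illustration, and all function and variable names are hypothetical.

```python
import math

def parallel(r1, r2):
    """Equivalent resistance of two resistors in parallel."""
    return r1 * r2 / (r1 + r2)

# Parasitics quoted in the text (PD at 2 V reverse bias, 70 um x 70 um pad).
C_J, R_S = 35.2e-15, 85.0
C_PAD, R_PAD = 15.2e-15, 4.0
R_L = 50.0

def f3db_octc(r_f, c_f):
    """Eqs. 7.1 to 7.4: sum of the open-circuit time constants."""
    r_eq = parallel(r_f, R_L + R_PAD)
    tau1 = (c_f + C_PAD) * r_eq
    tau2 = C_J * (R_S + r_eq)
    return 1.0 / (2 * math.pi * (tau1 + tau2))

def f3db_filter_pole(r_f, c_f):
    """Simplified estimate (assumption): filter pole only, parasitics ignored."""
    return 1.0 / (2 * math.pi * c_f * parallel(r_f, R_L))

DESIGNS = {
    "LPF1": (400.0, 1.35e-12),
    "LPF2": (160.0, 2.34e-12),
    "LPF3": (400.0, 2.34e-12),
}

def summarize():
    """Return {name: (OCTC estimate, filter-pole estimate)} in GHz."""
    return {
        name: (f3db_octc(r, c) / 1e9, f3db_filter_pole(r, c) / 1e9)
        for name, (r, c) in DESIGNS.items()
    }

if __name__ == "__main__":
    for name, (f_octc, f_pole) in summarize().items():
        print(f"{name}: OCTC {f_octc:.2f} GHz, filter pole only {f_pole:.2f} GHz")
```

With these inputs the filter-pole-only expression lands within about 0.01 GHz of the quoted design values, while including the pad and PD terms gives somewhat lower estimates. This suggests that the quoted figures are dominated by $C_f$ and $R_f || R_L$, in line with the dominant-pole argument above.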

**Table 7.1:** Equivalent circuit parameters

| Parameter | Value                     |
|-----------|---------------------------|
| $R_s$     | 85 $\Omega$<sup>1</sup>   |
| $C_j$     | 35 fF<sup>1</sup>         |
| $C_{pad}$ | 15 fF<sup>1</sup>         |
| $R_{pad}$ | 4 $\Omega$<sup>2</sup>    |
| $L_{pad}$ | 0.17 nH<sup>2</sup>       |
| $R_L$     | 50 $\Omega$<sup>1</sup>   |

<sup>1</sup> Values reported in [59].

<sup>2</sup> Values extracted from simulations.

### 7.3 Experimental results

Three chips were tested and numbered 1, 2, and 3 for reference. The implemented resistances were measured directly on the chip using an RF Ground-Signal-Ground (GSG) probe and an ohmmeter. Because of the presence of the PD junction capacitance and other stray capacitances in the structure, direct measurement of the capacitance on the chip does not give an accurate value of the filter capacitance at a specific frequency. Thus, an impedance fitting technique with a lumped model was used in the ADS software to extract the capacitance. Measurements of the S-parameters were performed with a 50 GHz lightwave component analyzer (Agilent N4373C) with a 2 V reverse bias applied to the PDs. The effects of the RF cables and probe tip were removed from the measurements by following a procedure relying on a calibration kit and a calibration substrate.

The lightwave component analyzer was used to perform optical-electrical measurements by generating modulated light that is coupled to the chip. The optical signal travels through the waveguide on the PIC to the PD. The PD then converts the optical signal into an electrical one, which the analyzer measures and from which it extracts the $S_{21}$ value that characterizes the optical-electrical conversion. To match the simulated and measured $S_{21}$ parameters, the optimization performs a parameter sweep of the low-pass filter capacitance $C_F$, with a goal defined based on the measured cut-off frequencies. Fig. 7.3 shows the measured and simulated $S_{21}$ parameters for chip no. 3 after optimization of the model. The simulated cut-off frequencies using the equivalent circuit with the optimized parameters match the experimental values.

Fig. 7.4 compares the measured and simulated $S_{22}$ parameters (output return loss) for chip no. 3 on a Smith chart, which gives a polar representation of the reflection coefficients. At any given frequency, both the magnitude and phase of the reflection coefficient for the fabricated structures and for their fitted models can be read from the chart. The results in Fig. 7.4 confirm the validity of the lumped model component values after the impedance fitting. The measured return loss is in good agreement with the return loss obtained from the lumped model (Fig. 7.1) over the whole frequency range. Assuming a characteristic impedance of $Z_o = 50 \Omega$, the measured normalized impedance (solid red plot), $z/z_o$, at the cut-off frequency of 1.56 GHz for LPF2 on chip no. 3 is 0.523-j1.063, and the normalized impedance derived from the model (dashed red plot) at the cut-off frequency of 1.67 GHz is 0.448-j0.928, which shows a good impedance correspondence between the lumped model and the implemented structure.
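To make the Smith chart reading concrete, the sketch below converts the two normalized impedances quoted above into reflection coefficients using the standard relation $\Gamma = (z - 1)/(z + 1)$ and reports the corresponding return loss. The helper names are illustrative, and the two points are taken at their respective cut-off frequencies, so the comparison is only indicative.

```python
import cmath
import math

def reflection_coefficient(z_norm):
    """Reflection coefficient for an impedance normalized to Z_o."""
    return (z_norm - 1) / (z_norm + 1)

def return_loss_db(z_norm):
    """Return loss in dB (positive number) for a normalized impedance."""
    return -20 * math.log10(abs(reflection_coefficient(z_norm)))

z_measured = complex(0.523, -1.063)   # LPF2, chip no. 3, measured at 1.56 GHz
z_model = complex(0.448, -0.928)      # lumped model at 1.67 GHz

for label, z in (("measured", z_measured), ("model", z_model)):
    gamma = reflection_coefficient(z)
    print(
        f"{label}: |Gamma| = {abs(gamma):.3f}, "
        f"angle = {math.degrees(cmath.phase(gamma)):.1f} deg, "
        f"return loss = {return_loss_db(z):.2f} dB"
    )
```

Both points map to a reflection coefficient magnitude of roughly 0.63, which is consistent with the close correspondence seen on the chart.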



**Fig. 7.3:** Measured and fitted $S_{21}$ parameter using the lumped model shown in Fig. 7.1 and the parameter values in Table 7.1 for chip no. 3 at 2 V reverse bias.

Table 7.2 compares the experimental values of $C_F$ and $R_F$ with the target design values for three chips from the same wafer. The target design values are shown in parentheses.

**Table 7.2:** Measured versus designed capacitance (pF) and resistance ( $\Omega$ )

|      | LPF1     |           | LPF2     |           | LPF3     |           |
|------|----------|-----------|----------|-----------|----------|-----------|
| Chip | R1 (400) | C1 (1.35) | R2 (160) | C2 (2.34) | R3 (400) | C3 (2.34) |
| 1    | 416      | 1.49      | 168      | 2.55      | 412      | 2.31      |
| 2    | 427      | 1.45      | 171      | 2.45      | 421      | 2.61      |
| 3    | 407      | 1.44      | 165      | 2.62      | 406      | 2.60      |

**Fig. 7.4:** Measured and fitted $S_{22}$ parameter using the lumped model shown in Fig. 7.1 and the parameter values in Table 7.1 for chip no. 3 at 2 V reverse bias.

From these results, it can be inferred that the experimental results are up to 12 % larger than the designed values. The uneven thickness of the oxide between the capacitor metal plates, caused by fabrication process variations, is likely the main cause of the difference. An oxide thickness variation of at least 10 % is expected on the fabricated chips. Furthermore, variation in the dimensions of the metal plates during the metallization process is another factor that contributes to the observed difference in the capacitance values. The differences in resistance between dies are attributed to variations in the dimensions of the doped resistor area after fabrication and in the doping density of the silicon. The small discrepancies between the designed and measured resistance values are due to the pad resistance and fabrication process variations. In particular, a lower doping density of the n-Si [99] and a smaller Si thickness [100] in the resistor areas can lead to a larger resistance. While insufficient data is available to perform an accurate statistical analysis comparing these results with their CMOS counterparts, the presented results for three chips show small chip-to-chip variations for the resistors and capacitors on the PICs. It should be noted that chip-to-chip variations in the capacitor and resistor values in CMOS processes are significant [101]. For instance, simulation results in a 65 nm CMOS technology show a $\pm 10$ % and $\pm 30$ % chip-to-chip variation in the values of MIM capacitors and silicided poly resistors, respectively.
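The deviations discussed above can be tabulated directly from Table 7.2. The sketch below computes the percentage difference between each measured value and its design target; the dictionary layout and names are illustrative only.

```python
# Design targets and measured values transcribed from Table 7.2
# (resistances in ohm, capacitances in pF).
TARGETS = {"R1": 400.0, "C1": 1.35, "R2": 160.0, "C2": 2.34, "R3": 400.0, "C3": 2.34}
MEASURED = {
    1: {"R1": 416, "C1": 1.49, "R2": 168, "C2": 2.55, "R3": 412, "C3": 2.31},
    2: {"R1": 427, "C1": 1.45, "R2": 171, "C2": 2.45, "R3": 421, "C3": 2.61},
    3: {"R1": 407, "C1": 1.44, "R2": 165, "C2": 2.62, "R3": 406, "C3": 2.60},
}

def deviations_percent():
    """Map (chip, element) to the deviation from the design target in percent."""
    return {
        (chip, name): 100.0 * (value - TARGETS[name]) / TARGETS[name]
        for chip, row in MEASURED.items()
        for name, value in row.items()
    }

devs = deviations_percent()
worst = max(devs, key=devs.get)
print(f"largest deviation: chip {worst[0]}, {worst[1]}: {devs[worst]:+.1f} %")
for (chip, name), dev in sorted(devs.items()):
    print(f"chip {chip} {name}: {dev:+.1f} %")
```

The largest deviation is that of C2 on chip 3, at just under 12 %. The resistors stay within about 7 % of their targets, and one capacitor (C3 on chip 1) is slightly below its design value.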

Table 7.3 summarizes the cut-off frequencies obtained from the measurements and from the lumped model, in comparison with the designed values, for the three chips. There is good agreement between the experimental results for the different dies. However, the measured cut-off frequencies of LPF2 and LPF3 are lower than the cut-off frequencies expected from the designed values, while those of LPF1 are within 4 % of the design value. These discrepancies are likely caused by an increase in the effective value of the passive elements (e.g., capacitors) resulting from fabrication process variations and fringing fields, as discussed earlier.

**Table 7.3:** Filter cut-off frequencies (GHz).

|         | LPF1 (design: 2.65) |          | LPF2 (design: 1.78) |          | LPF3 (design: 1.53) |          |
|---------|---------------------|----------|---------------------|----------|---------------------|----------|
| Chip ID | Fit                 | Measured | Fit                 | Measured | Fit                 | Measured |
| 1       | 2.65                | 2.67     | 1.69                | 1.61     | 1.38                | 1.41     |
| 2       | 2.69                | 2.72     | 1.70                | 1.65     | 1.22                | 1.24     |
| 3       | 2.71                | 2.75     | 1.67                | 1.56     | 1.21                | 1.23     |

### 7.4 Conclusion

In this chapter, RC LPFs were implemented on a PIC as a case study to show that bulky passive RF components in a receiver front-end can be built with silicon photonics. To investigate the performance of integrated RF components on the PIC, the S-parameters of three RC filters were analyzed. A circuit model was used to evaluate the design strategy. The capacitor values of the filters were extracted using the ADS optimization tool. Considering the effect of the fringing field capacitance, the values extracted from the experiment are in agreement with the expected values. This agreement shows that the approach is practical for building LPFs in SiPh. Furthermore, chip-to-chip variations comparable to those of CMOS IC designs were observed. This validates the circuit model and demonstrates the feasibility of implementing passive components on PICs.

## Chapter 8

# A High-speed Moving Average Integrator in SiP for TIA-less Receivers

This chapter concludes the theme of passives in silicon photonics and presents a moving average filter built by combining optical delay lines, photodetectors, and capacitors. The design and measurement methodologies are detailed, and the measurements are compared to simulated values. The work presented in this chapter has been accepted for publication as a journal paper in the IEEE Photonics Technology Letters [10].

### 8.1 Introduction

Silicon photonics (SiP) plays a key role in addressing the ever-increasing demand for optical transceivers with high bandwidth and low power consumption for future data centers. Indeed, SiP technology provides a unique platform to design the next generation of optical interconnects by harnessing well-developed CMOS fabrication technologies. One of the bottlenecks in the design of transceivers is the high power consumption of the CMOS receiver front-end (approximately 22 % of the power budget) [36, 47, 48]. Integrating receiver front-ends emerged recently with the aim of reducing power consumption [18, 34]. They use either a photocurrent integrator or a transimpedance amplifier (TIA) followed by sampling capacitors and comparators [5, 6].

There are several challenges involved in developing integrating receiver front-ends. The power consumption of a high-gain TIA and the clock-phase generation circuit in a 65 nm CMOS node can exceed 42 % of the power budget of the CMOS receiver front-end [34]. Furthermore, these receivers use a capacitor to integrate the photocurrent and convert it into a voltage. The accumulated voltage is amplified and then sampled using an analog-to-digital converter. Finally, the capacitor is reset using a short pulse. Using a reset pulse degrades sensitivity because it reduces the integration time [50] or requires large integrating capacitances [52]. Furthermore, this reset pulse is prone to process variation and can be difficult to generate at high speed, increasing the strain on the CMOS technology node.

The other reported problem in the design of optical receivers is charge sharing between the integrating capacitor and the sampling capacitor, detailed in [9]. Charge sharing further deteriorates the sensitivity of the optical receiver, increases the minimum junction capacitance that the photodiode must have, and eventually limits its speed [9].

In the previous chapter, the feasibility of implementing passive electrical components such as resistors and capacitors in silicon photonics, as a cost-effective photonic-electronic co-design approach, was demonstrated [9]. In this chapter, a moving average integrator is monolithically combined with an optical receiver structure in silicon photonics. With this approach, not only is the charge sharing issue resolved, which leads to enhanced sensitivity, but there is also no need to generate a short-duration reset pulse. Furthermore, this structure enables a photonic-electronic co-design approach for data communications that needs neither complex clock-phase generation circuitry nor a TIA on the CMOS chip (TIA-less) and, hence, significantly reduces the power consumption of the receiver.

### 8.2 Design methodology

A schematic diagram of the moving average integrator is shown in Fig. 8.1. The structure is a proof of concept implemented in silicon photonics and targets four-bit integration at 10 Gbps. To do so, the light is split into two branches using a directional coupler, and one path is delayed by 400 ps relative to the other. The optical power at the photodiodes is matched by adjusting the coupling ratio of the directional coupler. Finally, a capacitor is used to integrate the current over a period of 400 ps. The voltage across the capacitor is simply the integral of the current difference, divided by the capacitance $C$, as given by the following equation:

$$V_{OUT} = \frac{1}{C} \int \left(i_{PD2}(t) - i_{PD1}(t)\right)dt = \frac{1}{C} \int \left(i_{PD2}(t) - i_{PD2}(t - 400\,\text{ps})\right)dt \tag{8.1}$$

where  $i_{PD1}$  and  $i_{PD2}$  are the photocurrents generated by PD1 and PD2, respectively.



**Fig. 8.1:** Schematic diagram of the moving average structure and the cross-section of the MIM capacitors in the silicon photonics process.

The delay line consists of a low-loss rib silicon waveguide with a width of 3 $\mu$m and a core thickness of 220 nm surrounded by a 90 nm thick slab, as depicted in the inset of Fig. 8.1. Fig. 8.2 shows the group delay (ps/mm) of the low-loss waveguide calculated with Lumerical MODE. For a group index of 3.72 at the operating wavelength of 1550 nm, a 32.3 mm optical delay line creates a 400 ps delay between the two channels. The variation of the time delay in the wavelength range from 1530 nm to 1565 nm is approximately 5 ps, which can impose a deterministic timing jitter on the output signal but has a negligible effect on the performance of the proposed moving average structure for data rates up to 10 Gbps. Considering the typical loss of approximately 0.7 dB/cm in 3 $\mu$m × 220 nm waveguides, the directional coupler must compensate for 2.3 dB of additional loss caused by the delay line in channel 1. For a gap of 200 nm, the length of the rib directional coupler consisting of 500 nm × 220 nm waveguides is found to be 5 $\mu$m through Lumerical FDTD simulations [96]. The calculated length provides a cross coupling ratio of $\kappa^2 = 0.63$ and a through coupling ratio of $t^2 = 0.37$, which balances the optical power reaching the two photodiodes. Two SiGe integrated photodiodes convert the optical signals to electrical currents. The adjacent photodiodes are connected through a 200 fF capacitor. The converted data and its delayed version are integrated by the capacitor.
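The delay-line numbers above follow from two short calculations, sketched below under the stated values: the physical length needed for a 400 ps delay at a group index of 3.72, and the extra propagation loss that the coupler split must offset. The function names are illustrative and the vacuum speed of light is the only constant not taken from the text.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s

def delay_line_length_mm(delay_ps, group_index):
    """Waveguide length that produces the requested group delay."""
    return C0 * delay_ps * 1e-12 / group_index * 1e3

def split_imbalance_db(cross_ratio, through_ratio):
    """Power imbalance between the two coupler outputs, in dB."""
    return 10 * math.log10(cross_ratio / through_ratio)

length_mm = delay_line_length_mm(400.0, 3.72)
extra_loss_db = 0.7 * length_mm / 10.0        # 0.7 dB/cm design loss
imbalance_db = split_imbalance_db(0.63, 0.37)

print(f"delay-line length: {length_mm:.1f} mm")
print(f"extra loss in the delayed arm: {extra_loss_db:.2f} dB")
print(f"coupler split imbalance: {imbalance_db:.2f} dB")
```

The length comes out at about 32 mm and both the extra loss and the split imbalance at about 2.3 dB, so the 0.63/0.37 split offsets the design-value loss of the delayed arm.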

**Fig. 8.2:** Calculated group delay per mm for the 3 $\mu$m × 220 nm low-loss rib waveguides used for the delay line (DL). The inset shows a simulation of the cross-section of the fundamental mode.

The design methodology for the MIM capacitors (Fig. 8.1) in the silicon photonic platform is detailed in [95]. To achieve a 200 fF capacitance in the silicon photonic process, the dimensions of the metal plates must be 100 $\mu$m × 94 $\mu$m according to HFSS simulations. The thickness of the Metal 1 layer is 0.75 $\mu$m and that of Metal 2 is 2 $\mu$m. An oxide layer of 1.5 $\mu$m is sandwiched between the two metal layers, as illustrated in the inset of Fig. 8.1. The chip was fabricated at the A\*STAR Institute of Microelectronics (IME). The fabricated chip is mounted and wire-bonded to PCB pads with a pitch of 2.54 mm to allow the use of a differential high-impedance probe for the measurement of the output signal. A high-impedance probe ensures an RC time constant much greater than the unit interval at speeds up to 10 Gbps, which is essential for signal integrity and the observation of the moving average operation. Fig. 8.3 shows a micrograph of the photonic chip mounted on the PCB. The implemented structure performs one-bit integration at 2.5 Gbps, two-bit integration at 5 Gbps, and four-bit integration at 10 Gbps. There are two distinguishable decision levels in the output signal at 2.5 Gbps, three levels at 5 Gbps, and five levels at 10 Gbps. An electrical circuit (e.g., in CMOS) can be wire-bonded to the SiP chip to recover the transmitted data. The CMOS receiver front-end will be a TIA-less circuit with no reset function, followed by a multi-level decision circuit. The operation of the proposed moving average structure is verified through simulations and experiments.
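The relation between data rate and number of decision levels can be illustrated with an idealized discrete-time model: the 400 ps window spans 1, 2, or 4 unit intervals at 2.5, 5, and 10 Gbps, and the output is proportional to the number of ones inside the window. The sketch below is a simplification that ignores parasitics, noise, and power mismatch. The PRBS-7 generator and all names are illustrative and do not reproduce the simulation setup used in this chapter.

```python
def prbs7(n_bits, seed=0x7F):
    """PRBS-7 sequence from a 7-bit LFSR (feedback from the two oldest bits)."""
    state, bits = seed, []
    for _ in range(n_bits):
        new = ((state >> 6) ^ (state >> 5)) & 1
        bits.append(new)
        state = ((state << 1) | new) & 0x7F
    return bits

def window_bits(bit_rate_gbps, delay_ps=400.0):
    """Number of unit intervals covered by the optical delay."""
    return round(delay_ps * 1e-12 * bit_rate_gbps * 1e9)

def moving_sum(bits, window):
    """Ideal integrator output: count of ones in the last `window` bits."""
    return [sum(bits[max(0, k - window + 1): k + 1]) for k in range(len(bits))]

def decision_levels(bit_rate_gbps, n_bits=254):
    """Distinct output levels seen over two PRBS-7 periods."""
    window = window_bits(bit_rate_gbps)
    return sorted(set(moving_sum(prbs7(n_bits), window)))

for rate in (2.5, 5.0, 10.0):
    levels = decision_levels(rate)
    print(f"{rate} Gbps: window = {window_bits(rate)} bit(s), {len(levels)} levels")
```

In this idealized model an N-bit window yields N + 1 levels, which matches the two, three, and five levels stated above.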



**Fig. 8.3:** Micrograph of the wire-bonded silicon photonic die on the PCB with an enlarged view of the fabricated moving average structure. GC: grating coupler; DC: directional coupler; DL: delay line; PD1, PD2: photodiodes; G: ground pads; S: signal pads; V: DC voltage pads; $C_f$: integrating capacitor.

### 8.3 Data recovery principle

Fig. 8.4 shows a schematic diagram of the physical model of the moving average structure. Bond wires shorter than 1 mm contribute negligible parasitics, so their effect is neglected. In Fig. 8.5a, there are two distinct decision-making levels ($j_1$ and $j_2$) for data recovery at 2.5 Gbps by the TIA-less receiver. On the other hand, in Fig. 8.5b, there are three levels ($m_1$ to $m_3$) at 5 Gbps, and in Fig. 8.5c, five levels ($n_1$ to $n_5$) are distinguishable at 10 Gbps. For each of the observed voltage levels, a voltage reference is specified. The output signal is then compared with the references while considering the preceding recovered bit(s) for the decision-making, unless proper encoding and decoding are employed [56]. For example, at 2.5 Gbps, an output signal at the voltage level of $j_1$ corresponds to receiving a binary '0' in the previous unit interval (UI), whereas an output voltage at the level of $j_2$ corresponds to receiving a binary '1' in the previous UI. At 5 Gbps, level $m_1$ corresponds to '00', $m_2$ corresponds to '01' and '10', and $m_3$ corresponds to '11' in the previous two UIs. At 10 Gbps, level $n_1$ represents the bit stream '0000'; level $n_2$ represents '0001', '0010', '0100', and '1000'; level $n_3$ corresponds to '0011', '0101', '0110', '1001', '1010', and '1100'; level $n_4$ corresponds to '0111', '1011', '1101', and '1110'; and level $n_5$ corresponds to '1111' in the previous four UIs.

**Fig. 8.4:** Schematic diagram of the moving average structure including the parasitics and the high-impedance probe input resistance, where $R_L = 100 \ k\Omega$ is the high-impedance probe input impedance, $C_f = 200 \ fF$ is the integrating capacitor of the moving average integrator, $C_{j1} = C_{j2} = 15 \ fF$ are the photodiode junction capacitances, $R_{s1} = R_{s2} = 125 \ \Omega$ are the photodiode series resistances, and $C_{pad1} = C_{pad2} = 15 \ fF$ are the pad capacitances.

**Fig. 8.5:** Simulation results for random sequences (a) at 2.5 Gbps with two decision-making levels of $j_1$ and $j_2$, (b) at 5 Gbps with three decision-making levels of $m_1$, $m_2$, and $m_3$, and (c) at 10 Gbps with five decision-making levels of $n_1$, $n_2$, $n_3$, $n_4$, and $n_5$. The insets show the corresponding simulated eye diagrams.

For integration over two bits and above, more than one bit sequence leads to the same level for some of the aforementioned levels. Proper logic in the associated electronics can recover the specific incoming bit stream without considering previous bits if proper encoding and decoding are employed [56]. For example, for a four-bit integrating front-end, if the data, $D$, are encoded into a bit pattern, $B$, on the transmitter side according to:

$$B(x) = D(x) \oplus B(x-1) \oplus B(x-2) \oplus B(x-3) \tag{8.2}$$

then the receiver will receive the sum, $B_s$, of four bits at the front-end:

$$B_s = B(x) + B(x-1) + B(x-2) + B(x-3) \tag{8.3}$$

The receiver can then recover the data pattern, $D$, through a simple modulo-2 operation:

$$D(x) = B_s \bmod 2 \tag{8.4}$$
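The encoding and decoding steps of equations 8.2 to 8.4 can be exercised with a short round-trip sketch. It assumes that the encoder starts from an all-zero history, i.e. $B(x) = 0$ for $x < 0$, which is a choice made here for illustration; the function names are likewise hypothetical.

```python
WINDOW = 4  # number of bits integrated by the front-end

def precode(data):
    """Eq. 8.2: B(x) = D(x) xor B(x-1) xor B(x-2) xor B(x-3)."""
    encoded = []
    for d in data:
        bit = d
        for previous in encoded[-(WINDOW - 1):]:
            bit ^= previous
        encoded.append(bit)
    return encoded

def front_end_sums(encoded):
    """Eq. 8.3: sum of the current and the three previous transmitted bits."""
    return [
        sum(encoded[max(0, x - WINDOW + 1): x + 1])
        for x in range(len(encoded))
    ]

def decode(sums):
    """Eq. 8.4: the data bit is the parity of the received sum."""
    return [s % 2 for s in sums]

data = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0]
recovered = decode(front_end_sums(precode(data)))
print("data     :", data)
print("recovered:", recovered)
```

The round trip works because the parity of the four-bit sum equals the XOR of the same four bits, which by equation 8.2 is $D(x)$. Each decision therefore depends only on the current multi-level sample.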

### 8.4 Experimental procedure and results

Fig. 8.6 illustrates the experimental setup used for the characterization of the moving average integrator. A tunable laser provides the optical signal at 1550 nm. Two polarization controllers adjust the state of polarization at the input port of the modulator and before coupling to the grating coupler on the optical chip. An optical power of 8 dBm is incident on the chip throughout the experimental validation. A power measurement on back-to-back grating couplers shows an insertion loss of 5 dB for a single grating coupler. The output signal is measured with a high-impedance differential probe head (N5445A InfiniiMax III) compatible with a real-time scope (MSO-X 92504A from Keysight Technologies). A PRBS 7 electrical signal modulates the optical signal at the desired data rates of 2.5 Gbps and 5 Gbps.

**Fig. 8.6:** Experimental setup for the characterization of the moving average integrator.

The 400 ps optical delay implemented by the rib waveguide delay line is verified experimentally with a GSG RF probe. Fig. 8.7a shows the observed 400 ps time delay between the channels. The experimental results show promising performance at 2.5 Gbps and 5 Gbps, matching the simulation results obtained in Cadence Spectre using the equivalent circuit shown in Fig. 8.4. Fig. 8.7b and Fig. 8.7c compare the experimental results to the simulation results at 2.5 Gbps and 5 Gbps. The noise observed in the experimental results is due to the mismatch in optical power between the two channels. Further investigation showed that the rib waveguides suffered from an average propagation loss of 2.5±0.5 dB/cm as a result of fabrication process variations, whereas the directional coupler in the structure is designed to compensate for approximately 0.7 dB/cm of loss in the optical delay line.

**Fig. 8.7:** (a) Experimental validation of the 400 ps optical delay line (length of 32.3 mm at 1550 nm) using 50 $\Omega$ impedance GSG probes. Experimental versus simulation results: the purple graphs are the measured output and the overlapped red graph is the simulated output at (b) 2.5 Gbps and (c) 5 Gbps.

The photocurrents observed in channel 1 (PD1) and channel 2 (PD2) are 224 and 525 $\mu$A, respectively, using an RF probe with an input impedance of 50 $\Omega$. These currents correspond to 11.8 and 5 $\mu$A for the high-impedance probe with an input impedance of 100 k$\Omega$. For the reported SiGe PD responsivity of 0.7 A/W, the received optical powers in channel 1 (PD1) and channel 2 (PD2) are 0.32 mW (-4.95 dBm) and 0.75 mW (-1.25 dBm), respectively, showing a 3.7 dB difference. This corresponds to a propagation loss of approximately 1.9 dB/cm in the delay line, which is in agreement with the reported values.
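The power figures above can be retraced with a few lines of arithmetic. The sketch below converts the two photocurrents to optical power using the 0.7 A/W responsivity and then estimates the delay-line loss per centimeter. The last step assumes that the total delay-line loss equals the measured 3.7 dB imbalance plus the imbalance designed into the 0.63/0.37 coupler split. That reading is an interpretation made here, not a derivation stated in the text, and the names are illustrative.

```python
import math

RESPONSIVITY_A_PER_W = 0.7
DELAY_LINE_CM = 3.23

def optical_power_mw(photocurrent_ua):
    """Optical power at the photodiode inferred from its photocurrent."""
    return photocurrent_ua * 1e-6 / RESPONSIVITY_A_PER_W * 1e3

def to_dbm(power_mw):
    return 10 * math.log10(power_mw)

p1_mw = optical_power_mw(224.0)   # channel 1 (delayed arm)
p2_mw = optical_power_mw(525.0)   # channel 2
measured_imbalance_db = to_dbm(p2_mw) - to_dbm(p1_mw)

# Assumption: the coupler sends 0.63 of the power into the delayed arm,
# so its designed imbalance adds to the measured one.
designed_imbalance_db = 10 * math.log10(0.63 / 0.37)
loss_db_per_cm = (measured_imbalance_db + designed_imbalance_db) / DELAY_LINE_CM

print(f"PD1: {p1_mw:.2f} mW ({to_dbm(p1_mw):.2f} dBm)")
print(f"PD2: {p2_mw:.2f} mW ({to_dbm(p2_mw):.2f} dBm)")
print(f"measured imbalance: {measured_imbalance_db:.2f} dB")
print(f"estimated delay-line loss: {loss_db_per_cm:.2f} dB/cm")
```

Under this assumption the estimate comes out close to the 1.9 dB/cm quoted above and just below the 2.5±0.5 dB/cm range reported for the rib waveguides.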

### 8.5 Conclusion

In this chapter, an electro-optic moving average integrator implemented with silicon photonics is demonstrated. Due to the reduced cost per unit area of fabrication in silicon photonics processes compared to the state-of-the-art CMOS processes used in optical interconnects, monolithic integration of a sampling capacitor on silicon photonics to create a moving average integrator is an effective way to mitigate issues such as sensitivity degradation and high power consumption in conventional optical receivers. Moving average integrators do not need a reset circuit, transimpedance amplification, or multiple-clock phase generation, which simplifies the CMOS design for data recovery, but at the cost of adding multi-level decision circuitry. The performance of the SiP moving average integrator was experimentally verified by using a high-impedance probe. As a proof of concept for the feasibility of implementing such a structure, experiments at 2.5 Gbps and 5 Gbps were performed. The main limitation of this approach, however, is the accuracy of the required delay lines and the required matching of optical power at the outputs of the optical delay lines. If the coupling ratio of the optical power feeding each of the photodetectors through the delay lines is not carefully controlled, this structure will not function properly. More specifically, there will be voltage build-up on one side of the capacitor, which will eventually lead to saturation. Nevertheless, the outcome of this chip demonstrates the feasibility of an electro-optic moving average integrator implemented in silicon photonics for use in high-sensitivity and energy-efficient next-generation optical receivers. Such hybrid receivers, leveraging both SiP and CMOS, are expected to be in high demand in future optical interconnects.

This concludes the second theme of this thesis.

## Chapter 9

# Proposed Future Work and Conclusions

In this chapter, some proposed research ideas that extend the thesis work are presented, followed by thesis conclusions.

### 9.1 Proposed future work

This section presents several future research directions to extend the work presented in this thesis.

#### 9.1.1 Final integration of photonic and electronic chips

The work presented in Chapter 3, Chapter 4, and Chapter 5 provides a proof of concept that is demonstrated using an electronic IC and off-chip mechanical delay lines. Thus, work still needs to be done to integrate the IC chip with silicon photonic delay lines. This requires some fine-tuning of the placement of the bond pads of the silicon photonic chip to match that of the IC chip. Moreover, given the large size of the silicon photonic chip, it may be necessary to integrate the two chips using a PCB or an interposer, as it is likely difficult to find a package large enough to fit both.

#### 9.1.2 Monolithic design

To realize the full potential of the proposed optical receivers, an implementation in a monolithic process is needed. This will boost performance and reduce the size. As discussed in Chapter 5, the input capacitance has a significant impact on the sensitivity of the receiver. Thus, in a monolithic process, where bond pads are not needed between the photonic side and the electronic side, the sensitivity is expected to improve. As the sensitivity improves, the gain requirements of the gain stages can be relaxed and their power consumption reduced, making the receiver more energy-efficient.

#### 9.1.3 Clock recovery for the optical receivers

One challenge that remains to be addressed in the optical receivers is clock recovery. A published clock and data recovery (CDR) circuit, presented in [34], can be adapted and used for resettable receivers such as the one proposed in Chapter 5. Fig. 9.1 shows a conceptual block diagram of how that CDR would be employed in the receiver proposed in Chapter 5. To adapt it to this receiver and to avoid having to employ tunable optical delay lines or to close the CDR loop externally, the clock phase would be adjusted using a phase interpolator digitally controlled by an accumulator. From this, it would be possible to generate signals that are used to adaptively adjust the clock phase for proper sampling. The benefit of the delay lines is the simplification of the design of the digitally controlled oscillator, which will need to generate only one clock phase.



Fig. 9.1: A CDR, based on [34], that could be used with the receiver proposed in [7].
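The accumulator-controlled phase interpolator described above can be illustrated with a toy behavioural model. This is a minimal sketch of a generic bang-bang loop, not the CDR of [34]: the number of interpolator steps, the accumulator threshold, the ideal early/late detector, and all function names are assumptions chosen for illustration.

```python
def wrapped_error(target, code, steps):
    """Signed shortest distance from `code` to `target` on a circular phase axis."""
    return (target - code + steps // 2) % steps - steps // 2


def run_cdr_loop(target_code, start_code=0, steps=64, threshold=4, n_bits=2000):
    """Toy loop: early/late votes -> accumulator -> phase-interpolator code.

    All parameters are hypothetical; the phase detector is assumed ideal.
    """
    code = start_code
    acc = 0
    history = []
    for _ in range(n_bits):
        err = wrapped_error(target_code, code, steps)
        vote = (err > 0) - (err < 0)   # +1 requests a later phase, -1 an earlier one
        acc += vote
        if acc >= threshold:           # enough consistent votes: step the code up
            code = (code + 1) % steps
            acc = 0
        elif acc <= -threshold:        # enough consistent votes: step the code down
            code = (code - 1) % steps
            acc = 0
        history.append(code)
    return history


history = run_cdr_loop(target_code=23, start_code=60)
print("final phase-interpolator code:", history[-1])
```

The accumulator acts as a simple digital loop filter: a larger threshold slows the loop down but makes it less sensitive to isolated wrong votes.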

#### 9.1.4 A silicon photonics-based delay-locked loop

Finally, it might be interesting to develop a silicon photonics-based delay-locked loop (DLL) to be used with the receiver proposed in Chapter 5, as shown in Fig. 9.2. In this loop, which is based on [34], electronically tunable optical delay lines are used as the delay elements. This would allow the receiver to lock the phase between the data and the clock, removing the need for external tuning or calibration.



Fig. 9.2: A DLL, based on [34], that could be used with the proposed receiver.

### 9.2 Conclusions

This thesis explored the potential and the benefits of the co-design of electronics and photonics, which has become necessary with the explosive growth of traffic within data centers. After the motivation, the thesis presented background information and an overview of electronic and photonic delay lines. It was found that optical delay lines are capable of providing a wider range, better resolution, and lower power consumption compared to electronic delay lines. Then, the thesis presented the first theme, covering the design of three optical receivers that leverage optical delay lines to replace clock phase generation.

The first optical receiver presented is a 12.5 Gb/s optoelectronic receiver in 65 nm CMOS that employs SiP delay lines to eliminate clock generation circuits and the associated buffers. The receiver was validated experimentally through electronic and optical testing. The receiver achieved a sensitivity of -4 dBm at a BER of 10<sup>-12</sup> while exhibiting an energy efficiency of 1.93 pJ/bit. As a large part of the power consumption of conventional receivers is due to clock generation and clock buffering, this technique has the potential of improving energy efficiency and removing complex circuits in quarter-rate systems.
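The split-and-delay idea behind this receiver can be sketched at the bit level. The example below is an idealized illustration, not a model of the fabricated circuit: it assumes perfect one-bit delays and ignores noise, splitting loss, and analog settling, and the function name is hypothetical. At 12.5 Gb/s the bit period is 80 ps, so a quarter-rate receiver samples all four branches with a single 3.125 GHz clock phase.

```python
def demux_with_delays(bits, n_branches=4):
    """Split a serial stream into n copies, delay copy k by k bits,
    and sample every copy on the same clock edge (one edge per n bits)."""
    outputs = [[] for _ in range(n_branches)]
    # First shared clock edge at t = n_branches - 1 so every branch holds valid data.
    for t in range(n_branches - 1, len(bits), n_branches):
        for k in range(n_branches):
            # Branch k carries the input delayed by k bit periods.
            outputs[k].append(bits[t - k])
    return outputs


bits = [1, 0, 1, 1, 0, 0, 1, 0]
streams = demux_with_delays(bits)

# Re-interleave the branch outputs to confirm that no bit was lost.
recovered = []
for j in range(len(streams[0])):
    for k in reversed(range(len(streams))):
        recovered.append(streams[k][j])

print(streams)
print(recovered == bits)
```

Because the delays sit in the data path, the clock side needs only one phase, which is the property the receiver exploits in place of four quadrature phases.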

The second receiver improves upon the first receiver by reducing the number of channels to two, which reduces optical splitting losses, and by replacing the common-gate transimpedance amplifier at the front-end with a gain-improved, bandwidth-improved transimpedance amplifier (TIA) that has a resistor in series with an inductor in its feedback path to boost bandwidth. Because this TIA has high gain, the receiver needs no voltage gain stages, which reduces power consumption. The full receiver was implemented in 65 nm CMOS. The receiver die and the photodetectors were mounted in a QFN80 package and connected using bond wires. The receiver achieves a speed of 17 Gb/s. The two-channel optical receiver has an energy consumption of 156 fJ/bit. The combination of simplified clocking, signal amplification with only one gain-improved TIA, and a dynamic comparator results in superior energy efficiency. An input sensitivity of -7 dBm OMA was achieved for this receiver without using any equalization technique.

To improve on the speed of the previous two receivers, a novel reduced-bandwidth optical receiver was proposed. In this receiver, the conventional front-end is replaced with a two-bit integrating front-end, allowing the receiver to operate at data rates as high as 22 Gb/s. This receiver achieves an average sensitivity of -7.8 dBm and an energy efficiency of 1.43 pJ/bit. The receiver aims to address some of the issues present in integrating receivers such as [35] and current-amplifier-based receivers such as [47], mainly charge sharing and short reset pulses, without introducing a transimpedance amplifier circuit, while avoiding the critical timing and complexity associated with DFE-based [36] and speculative-DFE-based receivers [54]. The proposed receiver shows great potential at higher speeds of operation, where clocking becomes more demanding and conventionally requires duty-cycle and quadrature error detection circuits. Such circuits are not needed in this system.
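As a rough cross-check of the figures quoted for the three receivers, the average power implied by an energy-per-bit number is the product of energy per bit and data rate. The sketch below assumes each energy efficiency is quoted at the stated data rate; the resulting power values (roughly 24.1 mW, 2.65 mW, and 31.5 mW) are derived here for illustration and are not taken from the measurements. The sensitivities are specified differently for each receiver (for example, OMA versus average power), so the converted values are not directly comparable.

```python
def average_power_mw(energy_per_bit_j, data_rate_bps):
    """Average power in mW implied by an energy-per-bit figure at a given data rate."""
    return energy_per_bit_j * data_rate_bps * 1e3


def dbm_to_mw(power_dbm):
    """Convert a power level from dBm to mW."""
    return 10 ** (power_dbm / 10)


# (energy per bit [J], data rate [b/s], sensitivity [dBm]) as quoted in this chapter.
receivers = {
    "first": (1.93e-12, 12.5e9, -4.0),
    "second": (156e-15, 17e9, -7.0),
    "third": (1.43e-12, 22e9, -7.8),
}

for name, (e_bit, rate, sens) in receivers.items():
    power = average_power_mw(e_bit, rate)
    sens_uw = dbm_to_mw(sens) * 1e3
    print(f"{name} receiver: {power:.2f} mW implied, sensitivity about {sens_uw:.0f} uW")
```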

The second theme of this thesis deals with implementing, in the silicon photonics stack, passives that are conventionally found on the IC side or on the printed circuit board. The first passive structure is a 15 GHz monopole antenna. The antenna benefits from the high resistivity of the SiP substrate to achieve improved performance. S-parameters were measured, and inter-chip data transmission was demonstrated. This validation points towards the possibility of developing monolithic photonic emitters in SiP consisting of an antenna and a photodetector. As the silicon photonics process is relatively cheap, and as the potential monolithic integration of photodetector and antenna eliminates the need for matching networks, the proposed concept could prove to be a cost-effective, area-efficient solution for inter-chip and radio-over-fiber communication applications.

The second structure is an RC filter implemented in silicon photonics, which serves as a case study to show that bulky passive RF components in a receiver front end can be built with silicon photonics. S-parameters were measured, the measured values of the RC filter were compared to the design values, and a circuit model was extracted. This demonstrates the feasibility of implementing bulky passive components in silicon photonics.

Finally, an electro-optic moving average integrator implemented in silicon photonics was demonstrated. The performance of the SiP moving average integrator was experimentally verified using a high-impedance probe. As a proof of concept, experiments at 2.5 and 5 Gb/s were performed. Because a moving average filter removes the need for a reset signal, these results demonstrate the feasibility of an electro-optic moving average integrator implemented in silicon photonics for use in the high-sensitivity, energy-efficient resettable optical receivers proposed in [18] and [47].
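The reason a moving average removes the need for a reset signal can be seen in a discrete-time illustration. This is only a numerical sketch of the filtering idea, not a model of the electro-optic device; the oversampling factor of four samples per bit and the function names are assumptions.

```python
def moving_sum(samples, window):
    """Sliding-window integrator: each output is the sum of the last `window` samples."""
    out = []
    acc = 0.0
    for i, x in enumerate(samples):
        acc += x
        if i >= window:
            acc -= samples[i - window]   # oldest sample leaves the window, no reset needed
        out.append(acc)
    return out


def integrate_and_reset(samples, window):
    """Resettable integrator: accumulate, then clear at every bit boundary."""
    out = []
    acc = 0.0
    for i, x in enumerate(samples):
        if i % window == 0:
            acc = 0.0                    # explicit reset at the start of each bit
        acc += x
        out.append(acc)
    return out


# Three bits (1, 0, 1) with four samples per bit.
samples = [1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1]
sliding = moving_sum(samples, 4)
resettable = integrate_and_reset(samples, 4)

# Sampled at the end of each bit, both integrators give the same decision values.
print(sliding[3::4], resettable[3::4])
```

At the end-of-bit sampling instants the two outputs agree, but the sliding window discards old samples on its own, whereas the resettable integrator depends on a correctly timed reset pulse.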

- [1] Cisco, "Cisco global cloud index: Forecast and methodology, 2015–2020," 2016.
- [2] I. A. Young, E. M. Mohammed, J. T. Liao, A. M. Kern, S. Palermo, B. A. Block, M. R. Reshotko, and P. L. Chang, "Optical technology for energy efficient I/O in high performance computing," *IEEE Communications Magazine*, vol. 48, no. 10, pp. 184–191, 2010.
- [3] K. Yu, C. Li, H. Li, A. Titriku, A. Shafik, B. Wang, Z. Wang, R. Bai, C.-H. Chen, M. Fiorentino, et al., "A 25 Gb/s hybrid-integrated silicon photonic source-synchronous receiver with microring wavelength stabilization," *IEEE Journal of Solid-State Circuits*, vol. 51, no. 9, pp. 2129–2141, 2016.
- [4] A. Narasimha, B. Analui, Y. Liang, T. J. Sleboda, and C. Gunn, "A fully integrated 4×10 Gb/s DWDM optoelectronic transceiver in a standard 0.13 μm CMOS SOI," in 2007 IEEE International Solid-State Circuits Conference. Digest of Technical Papers, pp. 42–586, 2007.

- [5] B. Radi, M. S. Nezami, M. Ménard, F. Nabki, and O. Liboiron-Ladouceur, "A 12.5 Gb/s 1.93 pJ/bit optical receiver exploiting silicon photonic delay lines for clock phases generation replacement," *IEEE Transactions on Circuits and Systems II: Express Briefs*, 2019 (Early access).

- [6] M. Taherzadeh-Sani, B. Radi, M. S. Nezami, M. Ménard, O. Liboiron-Ladouceur, and F. Nabki, "A 17 Gbps 156 fJ/bit two-channel optical receiver with optical-input split and delay in 65 nm CMOS," *IEEE Transactions on Circuits and Systems I: Regular Papers*, 2020 (Early access).
- [7] B. Radi, M. Taherzadeh-Sani, M. S. Nezami, F. Nabki, M. Ménard, and O. Liboiron-Ladouceur, "A 22 Gb/s time-interleaved low-power optical receiver with a two-bit integrating front-end," *IEEE Journal of Solid-State Circuits* (Accepted, ID: JSSC-19-0447.R2).
- [8] B. Radi, A. S. Dhillon, and O. Liboiron-Ladouceur, "Demonstration of inter-chip RF data transmission using on-chip antennas in silicon photonics," *IEEE Photonics Technology Letters*, vol. 32, no. 11, pp. 659–662, 2020.
- [9] M. S. Nezami, B. Radi, A. Gour, Y. Xiong, M. Taherzadeh-Sani, M. Ménard, F. Nabki, and O. Liboiron-Ladouceur, "Integrated RF passive low-pass filters in silicon photonics," *IEEE Photonics Technology Letters*, vol. 30, no. 23, pp. 2052–2055, 2018.

- [10] M. S. Nezami, B. Radi, M. Taherzadeh-Sani, Y. Xiong, M. Ménard, F. Nabki, and O. Liboiron-Ladouceur, "A high-speed moving average integrator in silicon photonics for TIA-less receivers," *IEEE Photonics Technology Letters* (Accepted, ID: PTL-37154-2020.R1).

- [11] B. Radi and O. Liboiron-Ladouceur, "A survey of optical and electronic delay lines with a case study on using optical delay lines in 65nm CMOS optical receivers," in 2020 IEEE International Midwest Symposium on Circuits and Systems (MWSCAS) (Accepted, Paper ID = 3226, Conference date: August 9th – August 12th, 2020).
- [12] Y. Xiong, F. G. de Magalhães, B. Radi, G. Nicolescu, F. Hessel, and O. Liboiron-Ladouceur, "Towards a fast centralized controller for integrated silicon photonic multistage MZI-based switches," in 2016 Optical Fiber Communications Conference and Exhibition (OFC), pp. 1–3, 2016.
- [13] V. E. Paul, B. Radi, V. Tolstikhin, and O. Liboiron-Ladouceur, "A technology-based comparative study for the optoelectronic integration of optical front-ends," in 2016 Photonics North (PN), pp. 1–1, 2016.
- [14] B. Radi, V. E. Paul, V. Tolstikhin, and O. Liboiron-Ladouceur, "Comparative study of optoelectronics receiver front-end implementation in InP, SiGe, and CMOS," in 2016 IEEE Photonics Conference (IPC), pp. 222–223, 2016.

- [15] H. R. Mojaver, A. Das, B. Radi, V. Tolstikhin, K.-W. Leong, and O. Liboiron-Ladouceur, "Scalable SOA-based lossless photonic switch in InP platform," in Optical Interconnects 2020 (Accepted, Paper ID = 25, Conference date: September 27th – Oct 1st, 2020).

- [16] B. Radi, A. S. Dhillon, and O. Liboiron-Ladouceur, "Towards integrated RF photodetector-antenna emitters in silicon photonics," in 2020 IEEE Photonics Conference (IPC) (Accepted, Paper ID = 147, Conference date: September 28th – October 1st, 2020).
- [17] A. Novack, Y. Liu, R. Ding, M. Gould, T. Baehr-Jones, Q. Li, Y. Yang, Y. Ma, Y. Zhang, K. Padmaraju, et al., "A 30 GHz silicon photonic platform," in 10th International Conference on Group IV Photonics, pp. 7–8, 2013.
- [18] M. S. Hai, M. Ménard, and O. Liboiron-Ladouceur, "Integrated optical deserialiser time sampling based SiGe photoreceiver," Optics Express, vol. 23, no. 25, pp. 31736–31754, 2015.
- [19] S. Liao, N.-N. Feng, D. Feng, P. Dong, R. Shafiiha, C.-C. Kung, H. Liang, W. Qian, Y. Liu, J. Fong, et al., "36 GHz submicron silicon waveguide germanium photodetector," Optics Express, vol. 19, no. 11, pp. 10967–10972, 2011.
- [20] CMC Microsystems, "Fabrication and pricing." https://www.cmc.ca/technologies/, 2019.

- [21] A. Babakhani, X. Guan, A. Komijani, A. Natarajan, and A. Hajimiri, "A 77-GHz phased-array transceiver with on-chip antennas in silicon: Receiver and antennas," IEEE Journal of Solid-State Circuits, vol. 41, no. 12, pp. 2795–2806, 2006.

- [22] P. A. Nuyts, P. Reynaert, and W. Dehaene, "Continuous-time digital design techniques," in Continuous-time Digital Front-ends for Multistandard Wireless Transmission, pp. 125–185, Springer, 2014.
- [23] M. A. Abas, G. Russell, and D. Kinniment, "Built-in time measurement circuits—a comparative design study," *IET Computers & Digital Techniques*, vol. 1, no. 2, pp. 87–97, 2007.
- [24] G.-K. Dehng, J.-M. Hsu, C.-Y. Yang, and S.-I. Liu, "Clock-deskew buffer using a SAR-controlled delay-locked loop," *IEEE Journal of Solid-State Circuits*, vol. 35, no. 8, pp. 1128–1136, 2000.
- [25] J. G. Maneatis, "Low-jitter process-independent DLL and PLL based on self-biased techniques," *IEEE Journal of Solid-State Circuits*, vol. 31, no. 11, pp. 1723–1732, 1996.
- [26] C.-K. K. Yang et al., "Delay-locked loops-an overview," in Phase Locking in High Performance Systems, pp. 13–22, IEEE Press, 2003.
- [27] M. Maymandi-Nejad and M. Sachdev, "A monotonic digitally controlled delay element," *IEEE Journal of Solid-State Circuits*, vol. 40, no. 11, pp. 2212–2219, 2005.

- [28] Santec, "Santec ODL-330 manual." https://www.santec.com/en/support/manual.html.

- [29] X. Wang, L. Zhou, R. Li, J. Xie, L. Lu, K. Wu, and J. Chen, "Continuously tunable ultra-thin silicon waveguide optical delay line," *Optica*, vol. 4, no. 5, pp. 507–515, 2017.
- [30] O. Liboiron-Ladouceur, M. S. Hai, and M. Menard, "Time sampled photodetector devices and methods," Mar. 13 2018. US Patent 9,917,650.
- [31] J. Hwang, H. Do, H.-S. Choi, G.-S. Jeong, D. Koh, S. Kim, and D.-K. Jeong, "56Gb/s PAM-4 VCSEL transmitter with quarter-rate forwarded clock using 65nm CMOS circuits," in *Optical Fiber Communication Conference*, pp. W2A–1, Optical Society of America, 2019.
- [32] J. Kim, A. Balankutty, R. Dokania, A. Elshazly, H. S. Kim, S. Kundu, S. Weaver, K. Yu, and F. O'Mahony, "A 112Gb/s PAM-4 transmitter with 3-tap FFE in 10nm CMOS," in 2018 IEEE International Solid-State Circuits Conference-(ISSCC), pp. 102–104, 2018.
- [33] D. Kim, S. Choi, J. Choi, and J. J. Kim, "A reconfigurable multiphase *LC*-ring structure for programmable frequency multiplication," *IEEE Transactions on Circuits and Systems II: Express Briefs*, vol. 62, no. 1, pp. 51–55, 2014.

- [34] Y.-S. Lee, W.-H. Ho, and W.-Z. Chen, "A 25-Gb/s, 2.1-pJ/bit, fully integrated optical receiver with a baud-rate clock and data recovery," *IEEE Journal of Solid-State Circuits*, vol. 54, no. 8, pp. 2243–2254, 2019.

- [35] M. H. Nazari and A. Emami-Neyestanak, "A 24-Gb/s double-sampling receiver for ultra-low-power optical communication," *IEEE Journal of Solid-State Circuits*, vol. 48, no. 2, pp. 344–357, 2012.
- [36] A. Sharif-Bakhtiar and A. C. Carusone, "A 20 Gb/s CMOS optical receiver with limited-bandwidth front end and local feedback IIR-DFE," *IEEE Journal of Solid-State Circuits*, vol. 51, no. 11, pp. 2679–2689, 2016.
- [37] Telecommunication Standardization Sector of ITU, "Forward error correction for high bit-rate DWDM submarine systems," 2014.
- [38] G.-S. Jeong, J. Hwang, H.-S. Choi, H. Do, D. Koh, D. Yun, J. Lee, K. Park, H.-G. Ko, K. Lee, et al., "25-Gb/s clocked pluggable optics for high-density data center interconnections," IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 65, no. 10, pp. 1395–1399, 2018.
- [39] M. S. Hai, M. Ménard, and O. Liboiron-Ladouceur, "A 20 Gb/s SiGe photoreceiver based on optical time sampling," in 2015 European Conference on Optical Communication (ECOC), pp. 1–3, 2015.

- [40] X. Wang, L. Zhou, R. Li, J. Xie, L. Lu, and J. Chen, "Nanosecond-range continuously tunable silicon optical delay line using ultra-thin silicon waveguides," in *CLEO: Science and Innovations*, pp. STu1G–5, Optical Society of America, 2016.

- [41] J. Komma, C. Schwarz, G. Hofmann, D. Heinert, and R. Nawrodt, "Thermo-optic coefficient of silicon at 1550 nm and cryogenic temperatures," *Applied Physics Letters*, vol. 101, no. 4, p. 041905, 2012.
- [42] B. Nauta, "A CMOS transconductance-C filter technique for very high frequencies," IEEE Journal of Solid-State Circuits, vol. 27, no. 2, pp. 142–153, 1992.
- [43] Y. Sinangil and A. P. Chandrakasan, "A 128 kbit SRAM with an embedded energy monitoring circuit and sense-amplifier offset compensation using body biasing," *IEEE Journal of Solid-State Circuits*, vol. 49, no. 11, pp. 2730–2739, 2014.
- [44] J. Kim, A. Balankutty, R. K. Dokania, A. Elshazly, H. S. Kim, S. Kundu, D. Shi, S. Weaver, K. Yu, and F. O'Mahony, "A 112 Gb/s PAM-4 56 Gb/s NRZ reconfigurable transmitter with three-tap FFE in 10-nm FinFET," *IEEE Journal of Solid-State Circuits*, vol. 54, no. 1, pp. 29–42, 2018.
- [45] J.-R. Schrader, E. A. Klumperink, J. L. Visschers, and B. Nauta, "Pulse-width modulation pre-emphasis applied in a wireline transmitter, achieving 33 dB loss compensation at 5-Gb/s in 0.13-μm CMOS," *IEEE Journal of Solid-State Circuits*, vol. 41, no. 4, pp. 990–999, 2006.

- [46] S. Palermo, A. Emami-Neyestanak, and M. Horowitz, "A 90 nm CMOS 16 Gb/s transceiver for optical interconnects," *IEEE Journal of Solid-State Circuits*, vol. 43, no. 5, pp. 1235–1246, 2008.

- [47] S.-H. Huang and W.-Z. Chen, "A 25 Gb/s 1.13 pJ/b -10.8 dBm input sensitivity optical receiver in 40 nm CMOS," *IEEE Journal of Solid-State Circuits*, vol. 52, no. 3, pp. 747–756, 2017.
- [48] S. Saeedi, S. Menezo, G. Pares, and A. Emami, "A 25 Gb/s 3D-integrated CMOS/silicon-photonic receiver for low-power high-sensitivity optical communication," *Journal of Lightwave Technology*, vol. 34, no. 12, pp. 2924–2933, 2015.
- [49] S. Saeedi and A. Emami, "A 25Gb/s 170μW/Gb/s optical receiver in 28nm CMOS for chip-to-chip optical communication," in 2014 IEEE Radio Frequency Integrated Circuits Symposium, pp. 283–286, 2014.
- [50] M. Georgas, J. Orcutt, R. J. Ram, and V. Stojanovic, "A monolithically-integrated optical receiver in standard 45-nm SOI," *IEEE Journal of Solid-State Circuits*, vol. 47, no. 7, pp. 1693–1702, 2012.
- [51] S. Sidiropoulos and M. Horowitz, "Current integrating receivers for high speed system interconnects," in *Proceedings of the IEEE 1995 Custom Integrated Circuits Conference*, pp. 107–110, 1995.

- [52] S.-H. Huang, Z.-H. Hung, and W.-Z. Chen, "A 2×20-Gb/s, 1.2-pJ/bit, time-interleaved optical receiver in 40-nm CMOS," in 2014 IEEE Asian Solid-State Circuits Conference (A-SSCC), pp. 97–100, 2014.

- [53] A. Sharif-Bakhtiar, M. G. Lee, and A. C. Carusone, "Low-power CMOS receivers for short reach optical communication," in 2017 IEEE Custom Integrated Circuits Conference (CICC), pp. 1–8, 2017.
- [54] J. E. Proesel, Z. Toprak-Deniz, A. Cevrero, I. Ozkaya, S. Kim, D. M. Kuchta, S. Lee, S. V. Rylov, H. Ainspan, T. O. Dickson, et al., "A 32 Gb/s, 4.7 pJ/bit optical link with -11.7 dBm sensitivity in 14-nm FinFET CMOS," IEEE Journal of Solid-State Circuits, vol. 53, no. 4, pp. 1214–1226, 2017.
- [55] A. Cevrero, I. Ozkaya, P. A. Francese, C. Menolfi, T. Morf, M. Brandli, D. Kuchta, L. Kull, J. Proesel, M. Kossel, et al., "A 64Gb/s 1.4 pJ/b NRZ optical-receiver datapath in 14 nm CMOS FinFET," in 2017 IEEE International Solid-State Circuits Conference (ISSCC), pp. 482–483, IEEE, 2017.
- [56] J. V. Olmos, L. F. Suhr, B. Li, and I. T. Monroy, "Five-level polybinary signaling for 10 Gbps data transmission systems," *Optics Express*, vol. 21, no. 17, pp. 20417–20422, 2013.

- [57] M. Balakrishnan, T. Marian, K. P. Birman, H. Weatherspoon, and L. Ganesh, "Maelstrom: transparent error correction for communication between data centers," IEEE/ACM Transactions On Networking, vol. 19, no. 3, pp. 617–629, 2011.

- [58] K. Giewont, K. Nummy, F. A. Anderson, J. Ayala, T. Barwicz, Y. Bian, K. K. Dezfulian, D. M. Gill, T. Houghton, S. Hu, et al., "300-mm monolithic silicon photonics foundry technology," *IEEE Journal of Selected Topics in Quantum Electronics*, vol. 25, no. 5, pp. 1–11, 2019.
- [59] M. M. P. Fard, G. Cowan, and O. Liboiron-Ladouceur, "Responsivity optimization of a high-speed germanium-on-silicon photodetector," Optics Express, vol. 24, no. 24, pp. 27738–27752, 2016.
- [60] M. Raj, Y. Frans, P.-C. Chiang, S. L. C. Ambatipudi, D. Mahashin, P. De Heyn, S. Balakrishnan, J. Van Campenhout, J. Grayson, M. Epitaux, et al., "Design of a 50 Gb/s hybrid integrated Si-photonic optical link in 16-nm FinFET," IEEE Journal of Solid-State Circuits, vol. 55, no. 4, pp. 1086–1095, 2020.
- [61] C. Sun, M. Georgas, J. Orcutt, B. Moss, Y.-H. Chen, J. Shainline, M. Wade, K. Mehta, K. Nammari, E. Timurdogan, et al., "A monolithically-integrated chip-to-chip optical link in bulk CMOS," IEEE Journal of Solid-State Circuits, vol. 50, no. 4, pp. 828–844, 2015.

- [62] S. Assefa, H. Pan, S. Shank, W. M. Green, A. Rylyakov, C. Schow, M. Khater, S. Kamlapurkar, E. Kiewra, C. Reinholm, et al., "Monolithically integrated silicon nanophotonics receiver in 90nm CMOS technology node," in 2013 Optical Fiber Communication Conference and Exposition and the National Fiber Optic Engineers Conference (OFC/NFOEC), pp. 1–3, 2013.

- [63] L. Bogaert, J. Verbist, K. Van Gasse, G. Torfs, J. Bauwelinck, and G. Roelkens, "Germanium photodetector with monolithically integrated narrowband matching network on a silicon photonics platform," in *European Conference on Integrated Optics* (ECIO) 2019, 2019.
- [64] D. Jäger, "Microwave photonics," in *Microwave photonics*, p. 1, CRC Press, 2017.
- [65] A. Stohr, S. Babiel, P. J. Cannard, B. Charbonnier, F. van Dijk, S. Fedderwitz, D. Moodie, L. Pavlovic, L. Ponnampalam, C. C. Renaud, et al., "Millimeterwave photonic components for broadband wireless systems," *IEEE Transactions on Microwave Theory and Techniques*, vol. 58, no. 11, pp. 3071–3082, 2010.
- [66] Q. Zhou, A. S. Cross, A. Beling, Y. Fu, Z. Lu, and J. C. Campbell, "High-power V-band InGaAs/InP photodiodes," *IEEE Photonics Technology Letters*, vol. 25, no. 10, pp. 907–909, 2013.

- [67] Q. Li, K. Li, Y. Fu, X. Xie, Z. Yang, A. Beling, and J. C. Campbell, "High-power flip-chip bonded photodiode with 110 GHz bandwidth," *Journal of Lightwave Technology*, vol. 34, no. 9, pp. 2139–2144, 2016.

- [68] A. Stohr, R. Heinzelmann, C. Kaczmarek, and D. Jager, "Ultra-broadband K<sub>a</sub> to W-band 1.55 μm travelling-wave photomixer," *Electronics Letters*, vol. 36, no. 11, pp. 970–972, 2000.
- [69] N.-W. Chen, H.-J. Tsai, F.-M. Kuo, and J.-W. Shi, "High-speed W-band integrated photonic transmitter for radio-over-fiber applications," *IEEE Transactions on Microwave Theory and Techniques*, vol. 59, no. 4, pp. 978–986, 2011.
- [70] K. Li, X. Xie, Q. Li, Y. Shen, M. E. Woodsen, Z. Yang, A. Beling, and J. C. Campbell, "High-power photodiode integrated with coplanar patch antenna for 60 GHz applications," *IEEE Photonics Technology Letters*, vol. 27, no. 6, pp. 650–653, 2015.
- [71] J. Moody, K. Sun, Q. Li, A. Beling, and S. M. Bowers, "A vivaldi antenna based W-band MUTC photodiode driven radiator," in 2016 IEEE International Topical Meeting on Microwave Photonics (MWP), pp. 217–220, 2016.
- [72] T. Umezawa, K. Jitsuno, P. T. Dat, K. Kashima, A. Kanno, N. Yamamoto, and T. Kawanishi, "Millimeter-wave integrated photoreceivers for high data rate photonic wireless communication," *IEEE Journal of Selected Topics in Quantum Electronics*, vol. 24, no. 2, pp. 1–9, 2017.

- [73] M. Natrella, C.-P. Liu, C. Graham, F. van Dijk, H. Liu, C. C. Renaud, and A. J. Seeds, "Modelling and measurement of the absolute level of power radiated by antenna integrated THz UTC photodiodes," Optics Express, vol. 24, no. 11, pp. 11793–11807, 2016.

- [74] K. Sun, J. Moody, Q. Li, S. M. Bowers, and A. Beling, "High power integrated photonic W-band emitter," *IEEE Transactions on Microwave Theory and Techniques*, vol. 66, no. 3, pp. 1668–1677, 2017.
- [75] F. T. Ulaby, E. Michielssen, and U. Ravaioli, "Fundamentals of applied electromagnetics 6e," *Boston, Massachusetts: Prentice Hall*, 2010.
- [76] I. Papapolymerou, R. F. Drayton, and L. P. Katehi, "Micromachined patch antennas," IEEE Transactions on Antennas and Propagation, vol. 46, no. 2, pp. 275–283, 1998.
- [77] Nordic Semiconductor, "λ/4 printed monopole antenna for 2.45 GHz." https://infocenter.nordicsemi.com/pdf/nwp 008.pdf?cp=12 18, 2005.
- [78] M. Gebhart, T. Baier, and M. Facchini, "Automated antenna impedance adjustment for near field communication (NFC)," in *Proceedings of the 12th International Conference on Telecommunications*, pp. 235–242, 2013.
- [79] T. Lecklider, "The world of the near field: when scotty is beaming up, he's working in the very far field," *EE-Evaluation Engineering*, vol. 44, no. 10, pp. 52–56, 2005.

- [80] B. A. Floyd, C.-M. Hung, et al., "Intra-chip wireless interconnect for clock distribution implemented with integrated antennas, receivers, and transmitters," *IEEE Journal of Solid-State Circuits*, vol. 37, no. 5, pp. 543–552, 2002.

- [81] Y. Toeda, T. Fujimaki, M. Hamada, and T. Kuroda, "Fully integrated OOK-powered pad-less deep sub-wavelength-sized 5-GHz RFID with on-chip antenna using adiabatic logic in 0.18 μm CMOS," in 2018 IEEE Symposium on VLSI Circuits, pp. 27–28, 2018.
- [82] J. Geng, L. Zhang, C. Zhu, L. Dai, X. Shi, and H. Qian, "An oscillator-based CMOS magnetosensitive microarray biochip with on-chip inductor optimization methodology," *IEEE Transactions on Microwave Theory and Techniques*, vol. 66, no. 5, pp. 2556–2569, 2018.
- [83] S. Tanaka, T. Simoyama, T. Aoki, T. Mori, S. Sekiguchi, S.-H. Jeong, T. Usuki, Y. Tanaka, and K. Morito, "Ultra low-power (1.59 mW/Gbps), 56 Gbps PAM 4 operation of Si photonic transmitter integrating segmented PIN Mach-Zehnder modulator and 28-nm CMOS driver," *Journal of Lightwave Technology*, vol. 36, no. 5, pp. 1275–1280, 2018.
- [84] C. Xiong, D. M. Gill, J. E. Proesel, J. S. Orcutt, W. Haensch, and W. M. Green, "Monolithic 56 Gb/s silicon photonic pulse-amplitude modulation transmitter," Optica, vol. 3, no. 10, pp. 1060–1065, 2016.

- [85] V. Stojanović, R. J. Ram, M. Popović, S. Lin, S. Moazeni, M. Wade, C. Sun, L. Alloatti, A. Atabaki, F. Pavanello, et al., "Monolithic silicon-photonic platforms in state-of-the-art CMOS SOI processes," Optics Express, vol. 26, no. 10, pp. 13106–13121, 2018.

- [86] T. Baehr-Jones, T. Pinguet, P. L. Guo-Qiang, S. Danziger, D. Prather, and M. Hochberg, "Myths and rumours of silicon photonics," *Nature Photonics*, vol. 6, no. 4, pp. 206–208, 2012.
- [87] P. De Dobbelaere et al., "Advanced silicon photonics technology platform leveraging a semiconductor supply chain," in 2017 IEEE International Electron Devices Meeting (IEDM), pp. 34.1.1–34.1.4, 2017.
- [88] A. E.-J. Lim, J. Song, Q. Fang, C. Li, X. Tu, N. Duan, K. K. Chen, R. P.-C. Tern, and T.-Y. Liow, "Review of silicon photonics foundry efforts," *IEEE Journal of Selected Topics in Quantum Electronics*, vol. 20, no. 4, pp. 405–416, 2013.
- [89] J.-P. Raskin, "SOI technology: An opportunity for RF designers?," Journal of Telecommunications and Information Technology, pp. 3–17, 2009.
- [90] K. Benaissa, J.-Y. Yang, D. Crenshaw, B. Williams, S. Sridhar, J. Ai, G. Boselli, S. Zhao, S. Tang, S. Ashburn, et al., "RF CMOS on high-resistivity substrates for system-on-chip applications," *IEEE Transactions on Electron Devices*, vol. 50, no. 3, pp. 567–576, 2003.

- [91] J.-P. Raskin, A. Viviani, D. Flandre, and J.-P. Colinge, "Substrate crosstalk reduction using SOI technology," *IEEE Transactions on Electron Devices*, vol. 44, no. 12, pp. 2252–2261, 1997.

- [92] T. Zheng, M. Han, G. Xu, and L. Luo, "Design and fabrication of suspended high Q MIM capacitors by wafer level packaging technology," in 2015 16th International Conference on Electronic Packaging Technology (ICEPT), pp. 89–94, 2015.
- [93] A. S. Sedra and K. C. Smith, Microelectronic circuits: theory and applications. Oxford University Press, 2013.
- [94] J.-M. Lee and W.-Y. Choi, "An equivalent circuit model for germanium waveguide vertical photodetectors on Si," in *Microwave Photonics (MWP) and the 2014 9th Asia-Pacific Microwave Photonics Conference (APMP) 2014 International Topical Meeting on*, pp. 139–141, 2014.
- [95] M. S. Hai, M. N. Sakib, and O. Liboiron-Ladouceur, "A 16 GHz silicon-based monolithic balanced photodetector with on-chip capacitors for 25 Gbaud front-end receivers," Optics Express, vol. 21, no. 26, pp. 32680–32689, 2013.
- [96] L. Chrostowski and M. Hochberg, Silicon photonics design: from devices to systems. Cambridge University Press, 2015.

- [97] W.-C. Chuang, C.-W. Wang, W.-C. Chu, P.-Z. Chang, and Y.-C. Hu, "The fringe capacitance formula of microstructures," *Journal of Micromechanics and Microengineering*, vol. 22, no. 2, p. 025015, 2012.

- [98] J. C. Irvin, "Resistivity of bulk silicon and of diffused layers in silicon," *Bell System Technical Journal*, vol. 41, no. 2, pp. 387–410, 1962.
- [99] K. Takeuchi, A. Nishida, and T. Hiramoto, "Random fluctuations in scaled MOS devices," in 2009 International Conference on Simulation of Semiconductor Processes and Devices, pp. 1–7, 2009.
- [100] W. A. Zortman, D. C. Trotter, and M. R. Watts, "Silicon photonics manufacturing," Optics Express, vol. 18, no. 23, pp. 23598–23607, 2010.
- [101] A. Aktas and M. Ismail, "CMOS PLL calibration techniques," IEEE Circuits and Devices Magazine, vol. 20, no. 5, pp. 6–11, 2004.

In reference to IEEE copyrighted material which is used with permission in this thesis, the IEEE does not endorse any of McGill University's products or services. Internal or personal use of this material is permitted. If interested in reprinting/republishing IEEE copyrighted material for advertising or promotional purposes or for creating new collective works for resale or redistribution, please go to:

www.ieee.org/publications\_standards/publications/rights/rights\_link

to learn how to obtain a License from RightsLink. If applicable, University Microfilms and/or ProQuest Library, or the Archives of Canada may supply single copies of the dissertation.