JP2019525271A

JP2019525271A - Command arbitration for high-speed memory interface

Info

Publication number: JP2019525271A
Application number: JP2018524749A
Authority: JP
Inventors: James R. Magro; Kedarnath Balakrishnan; Jackson Peng; Hideki Kanayama
Original assignee: Advanced Micro Devices Inc
Current assignee: Advanced Micro Devices Inc
Priority date: 2016-07-15
Filing date: 2016-09-22
Publication date: 2019-09-05
Anticipated expiration: 2036-09-22
Also published as: US20180018291A1; JP6840145B2; KR20190022428A; CN107924375A; CN107924375B; KR102442078B1; WO2018013157A1; US10684969B2

Abstract

In one form, a memory controller includes a command queue and an arbiter. The command queue receives and stores memory access requests. The arbiter includes a plurality of sub-arbiters that provide a corresponding plurality of sub-arbitration winners from among the memory access requests during a controller cycle, and selects from among the plurality of sub-arbitration winners to provide a plurality of memory commands in a corresponding controller cycle. In another form, a data processing system includes a memory access agent that provides memory access requests, a memory system, and a memory controller connected to the memory access agent and the memory system.
[Selected drawing] FIG. 6

Description

The present disclosure relates generally to data processing systems, and more particularly to memory controllers for use in data processing systems having high-speed memory interfaces.

Computer systems typically use inexpensive, high-density dynamic random access memory (DRAM) chips as main memory. Most DRAM chips sold today are compatible with the various double data rate (DDR) DRAM standards promulgated by the Joint Electron Devices Engineering Council (JEDEC). DDR DRAM uses a conventional DRAM memory cell array with high-speed access circuitry to achieve high transfer rates and improve memory bus utilization. For example, DDR4 DRAM uses a memory cell array that requires an access time of 12 to 15 nanoseconds (ns), but accesses a large amount of data and serializes it at rates of up to 3.2 gigatransfers per second (GT/s), corresponding to a memory clock frequency of 1.6 gigahertz (GHz). The transfers use pseudo-open-drain signaling with on-die termination for good transmission line performance. While it is possible to operate a point-to-point interface at that rate to achieve fast transfers, it has become increasingly difficult for the memory controller to operate fast enough to schedule memory accesses.
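The relationship between the clock frequency and the transfer rate quoted above can be checked with a short calculation. The sketch below is illustrative only: the function names are hypothetical, and the 64-bit data bus width used for the bandwidth figure is an assumption that the text does not state.

```python
def transfers_per_second(mem_clock_hz: float, transfers_per_clock: int = 2) -> float:
    """Double data rate: one transfer on each edge of the memory clock."""
    return mem_clock_hz * transfers_per_clock


def peak_bandwidth_bytes_per_second(mem_clock_hz: float, bus_width_bits: int = 64) -> float:
    """Peak bandwidth for an assumed data bus width (64 bits is an assumption)."""
    return transfers_per_second(mem_clock_hz) * bus_width_bits / 8


rate = transfers_per_second(1.6e9)               # 1.6 GHz memory clock
bandwidth = peak_bandwidth_bytes_per_second(1.6e9)
print(f"{rate / 1e9} GT/s, {bandwidth / 1e9} GB/s peak on a 64-bit bus")
```

A 1.6 GHz memory clock therefore yields the 3.2 GT/s figure cited in the text, which is the rate the controller's scheduling logic has to keep up with.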

A typical DDR memory controller maintains a queue to store pending read and write requests, which allows the memory controller to improve efficiency by picking pending requests out of order. For example, the memory controller can retrieve from the queue, out of order, multiple memory access requests to the same row in a given rank of memory (referred to as "page hits") and issue them consecutively to the memory system, avoiding the overhead of precharging the current row and activating another row repeatedly. However, scanning and picking accesses from a deep queue while taking advantage of the bus bandwidth available with modern memory technologies such as DDR4 has become difficult to achieve with known memory controllers.
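The out-of-order selection of page hits described above can be sketched as follows. This is a minimal illustration, not the controller's actual logic: the `Request` fields, the `open_rows` mapping, and the `pick_page_hits` name are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Request:
    rank: int
    bank: int
    row: int
    col: int


def pick_page_hits(queue, open_rows):
    """Remove and return, in queue order, every request whose row is already open.

    open_rows maps (rank, bank) to the currently open row (hypothetical model).
    Requests that are not page hits stay in the queue in their original order.
    """
    hits, rest = [], []
    for req in queue:
        if open_rows.get((req.rank, req.bank)) == req.row:
            hits.append(req)
        else:
            rest.append(req)
    queue[:] = rest
    return hits


queue = [Request(0, 0, 5, 0), Request(0, 0, 9, 0), Request(0, 0, 5, 8)]
burst = pick_page_hits(queue, {(0, 0): 5})
print(burst)   # the two requests to row 5, issued back to back
print(queue)   # the request to row 9 remains pending
```

Issuing the two row-5 requests together avoids a precharge and activate between them, at the cost of scanning the whole queue on each pick, which is the scaling problem the text identifies.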

FIG. 1 is a block diagram of a data processing system according to some embodiments.
FIG. 2 is a block diagram of an accelerated processing unit (APU) suitable for use in the data processing system of FIG. 1.
FIG. 3 is a block diagram of a memory controller and associated physical interface (PHY) suitable for use in the APU of FIG. 2 according to some embodiments.
FIG. 4 is a block diagram of another memory controller and associated PHYs suitable for use in the APU of FIG. 2 according to some embodiments.
FIG. 5 is a block diagram of a memory controller according to some embodiments.
FIG. 6 is a block diagram of an arbiter that may be used as the arbiter of FIG. 5 according to some embodiments.

In the following description, the use of the same reference numerals in different drawings indicates similar or identical items. Unless otherwise noted, the word "connected" and its associated verb forms include both direct connection and indirect electrical connection by means known in the art. Unless otherwise noted, any description of direct connection also implies alternate embodiments using suitable forms of indirect electrical connection.

As will be described below in one form, a memory controller includes a command queue and an arbiter. The command queue receives and stores memory access requests. The arbiter includes a plurality of sub-arbiters that provide a corresponding plurality of sub-arbitration winners from among the memory access requests during a controller cycle, and selects from among the plurality of sub-arbitration winners to provide a plurality of memory commands in a corresponding controller cycle. In some embodiments, a memory command cycle may be shorter than the controller cycle. For example, the controller may operate according to a controller clock signal, while the memory cycle is defined by a memory clock signal having a higher frequency than the controller clock signal. The plurality of sub-arbiters can include a first sub-arbiter that selects a first sub-arbitration winner from among page hit commands in the command queue, a second sub-arbiter that selects a second sub-arbitration winner from among page conflict commands in the command queue, and a third sub-arbiter that selects a third sub-arbitration winner from among page miss commands in the command queue. The arbiter can further include a final arbiter for selecting from among the first, second, and third sub-arbitration winners.
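The two-level arbitration described above can be sketched in a few lines. This is a simplified illustration under several assumptions that the text does not specify at this point: a page hit targets the open row of a bank, a page conflict targets a bank with a different row open, and a page miss targets a bank with no open row; each sub-arbiter picks its oldest candidate; the final arbiter takes winners in hit, conflict, miss order; and two memory commands fit in one controller cycle. All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Request:
    age: int   # larger value means the request has waited longer (assumption)
    bank: int
    row: int


def classify(req, open_rows):
    """Classify a request against the open row of its bank (assumed definitions)."""
    open_row = open_rows.get(req.bank)
    if open_row is None:
        return "miss"        # bank has no open row
    return "hit" if open_row == req.row else "conflict"


def sub_arbitrate(queue, open_rows, kind):
    """One sub-arbiter: the oldest request of the given kind, or None."""
    candidates = [r for r in queue if classify(r, open_rows) == kind]
    return max(candidates, key=lambda r: r.age, default=None)


def arbitrate(queue, open_rows, commands_per_cycle=2):
    """Final arbiter: choose up to commands_per_cycle of the sub-arbitration winners."""
    winners = [sub_arbitrate(queue, open_rows, kind)
               for kind in ("hit", "conflict", "miss")]
    return [w for w in winners if w is not None][:commands_per_cycle]


queue = [Request(1, 0, 5), Request(3, 0, 7), Request(2, 1, 4), Request(5, 0, 5)]
print(arbitrate(queue, {0: 5}))
```

Because each sub-arbiter scans only its own class of requests, the three selections can proceed in parallel within a controller cycle, and the final arbiter only has to choose among at most three candidates.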

In another form, a data processing system includes a memory access agent that provides a plurality of memory access requests, a memory system, and a memory controller connected to the memory access agent and the memory system. The memory controller includes a command queue and an arbiter. The command queue stores memory access commands received from the memory access agent. The arbiter includes a plurality of sub-arbiters that provide a corresponding plurality of sub-arbitration winners from among the memory access requests during a controller cycle, and selects from among the plurality of sub-arbitration winners to provide a plurality of memory commands in a corresponding controller cycle.

In yet another form, a method of arbitrating among memory access requests can be used to improve performance and efficiency. A plurality of memory access requests is received and stored in a command queue. A plurality of sub-arbitration winners is selected from among the memory access requests during a first controller cycle. A plurality of memory commands is selected from among the plurality of sub-arbitration winners and provided in a corresponding plurality of memory command cycles.

FIG. 1 is a block diagram of a data processing system 100 according to some embodiments. Data processing system 100 generally includes a data processor 110 in the form of an accelerated processing unit (APU), a memory system 120, a peripheral component interconnect express (PCIe) system 150, a universal serial bus (USB) system 160, and a disk drive 170. Data processor 110 operates as the central processing unit (CPU) of data processing system 100 and provides various buses and interfaces useful in modern computer systems. These interfaces include two double data rate (DDRx) memory channels, a PCIe root complex for connection to a PCIe link, a USB controller for connection to a USB network, and an interface to a Serial Advanced Technology Attachment (SATA) mass storage device.

Memory system 120 includes a memory channel 130 and a memory channel 140. Memory channel 130 includes a set of dual inline memory modules (DIMMs) connected to a DDRx bus 132, including representative DIMMs 134, 136, and 138 that in this example correspond to separate ranks. Likewise, memory channel 140 includes a set of DIMMs connected to a DDRx bus 142, including representative DIMMs 144, 146, and 148.

PCIe system 150 includes a PCIe switch 152 connected to the PCIe root complex in data processor 110, a PCIe device 154, a PCIe device 156, and a PCIe device 158. PCIe device 156 in turn is connected to a system basic input/output system (BIOS) memory 157. System BIOS memory 157 can be any of a variety of non-volatile memory types, such as read-only memory (ROM), flash electrically erasable programmable ROM (EEPROM), and the like.

USB system 160 includes a USB hub 162 connected to a USB master in data processor 110, and representative USB devices 164, 166, and 168 each connected to USB hub 162. USB devices 164, 166, and 168 could be devices such as a keyboard, a mouse, a flash EEPROM port, and the like.

Disk drive 170 is connected to data processor 110 over a SATA bus and provides mass storage for the operating system, application programs, application files, and the like.

Data processing system 100 is suitable for use in modern computing applications by providing memory channel 130 and memory channel 140. Each of memory channels 130 and 140 can connect to state-of-the-art DDR memories such as DDR version four (DDR4), low-power DDR4 (LPDDR4), graphics DDR version five (GDDR5), and high bandwidth memory (HBM), and can be adapted for future memory technologies. These memories provide high bus bandwidth and high-speed operation. At the same time, they also provide low-power modes to save power for battery-powered applications such as laptop computers, and they provide built-in thermal monitoring.

FIG. 2 is a block diagram of an APU 200 suitable for use in data processing system 100 of FIG. 1. APU 200 generally includes a central processing unit (CPU) core complex 210, a graphics core 220, a set of display engines 230, a memory management hub 240, a data fabric 250, a set of peripheral controllers 260, a set of peripheral bus controllers 270, a system management unit (SMU) 280, and a set of memory controllers 290.

CPU core complex 210 includes a CPU core 212 and a CPU core 214. In this example, CPU core complex 210 includes two CPU cores, but in other embodiments CPU core complex 210 can include an arbitrary number of CPU cores. Each of CPU cores 212 and 214 is bidirectionally connected to a system management network (SMN), which forms a control fabric, and to data fabric 250, and is capable of providing memory access requests to data fabric 250. Each of CPU cores 212 and 214 may be a unitary core, or may further be a core complex with two or more unitary cores sharing certain resources such as caches.

Graphics core 220 is a high-performance graphics processing unit (GPU) capable of performing graphics operations such as vertex processing, fragment processing, shading, texture blending, and the like in a highly integrated and parallel fashion. Graphics core 220 is bidirectionally connected to the SMN and to data fabric 250, and is capable of providing memory access requests to data fabric 250. In this regard, APU 200 may support either a unified memory architecture in which CPU core complex 210 and graphics core 220 share the same memory space, or a memory architecture in which CPU core complex 210 and graphics core 220 share a portion of the memory space while graphics core 220 also uses a private graphics memory not accessible by CPU core complex 210.

Display engines 230 render and rasterize objects generated by graphics core 220 for display on a monitor. Graphics core 220 and display engines 230 are bidirectionally connected to a common memory management hub 240 for uniform translation into appropriate addresses in memory system 120, and memory management hub 240 is bidirectionally connected to data fabric 250 for generating such memory accesses and receiving read data returned from the memory system.

Data fabric 250 includes a crossbar switch for routing memory access requests and memory responses between any memory accessing agent and memory controllers 290. It also includes a system memory map, defined by the BIOS, for determining destinations of memory accesses based on the system configuration, as well as buffers for each virtual connection.
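The role of the system memory map in choosing a destination can be illustrated with a simple range lookup. The address ranges, destination names, and `route` function below are purely hypothetical; the text says only that the map is defined by the BIOS and used to determine where a memory access goes.

```python
# Hypothetical system memory map: (base, limit, destination), limit exclusive.
MEMORY_MAP = [
    (0x0_0000_0000, 0x1_0000_0000, "memory_controller_0"),
    (0x1_0000_0000, 0x2_0000_0000, "memory_controller_1"),
]


def route(address: int) -> str:
    """Return the destination whose address range contains the given address."""
    for base, limit, destination in MEMORY_MAP:
        if base <= address < limit:
            return destination
    raise ValueError(f"address {address:#x} is not mapped")


print(route(0x0_8000_0000))   # memory_controller_0
print(route(0x1_0000_0040))   # memory_controller_1
```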

Peripheral controllers 260 include a USB controller 262 and a SATA interface controller 264, each of which is bidirectionally connected to a system hub 266 and to the SMN bus. These two controllers are merely exemplary of peripheral controllers that may be used in APU 200.

Peripheral bus controllers 270 include a system controller or "Southbridge" (SB) 272 and a PCIe controller 274, each of which is bidirectionally connected to an input/output (I/O) hub 276 and to the SMN bus. I/O hub 276 is also bidirectionally connected to system hub 266 and to data fabric 250. Thus, for example, a CPU core can program registers in USB controller 262, SATA interface controller 264, SB 272, or PCIe controller 274 through accesses that data fabric 250 routes through I/O hub 276.

SMU 280 is a local controller that controls the operation of the resources on APU 200 and synchronizes communication among them. SMU 280 manages power-up sequencing of the various processors on APU 200 and controls multiple off-chip devices via reset, enable, and other signals. SMU 280 includes one or more clock sources not shown in FIG. 2, such as a phase locked loop (PLL), to provide clock signals for each of the components of APU 200. SMU 280 also manages power for the various processors and other functional blocks, and may receive measured power consumption values from CPU cores 212 and 214 and graphics core 220 to determine appropriate power states.

APU 200 also implements various system monitoring and power saving functions. In particular, one system monitoring function is thermal monitoring. For example, if APU 200 becomes hot, SMU 280 can reduce the frequency and voltage of CPU cores 212 and 214 and/or graphics core 220. If APU 200 becomes too hot, it can be shut down entirely. Thermal events can also be received from external sensors by SMU 280 via the SMN bus, and SMU 280 can reduce the clock frequency and/or power supply voltage in response.

FIG. 3 is a block diagram of a memory controller 300 and an associated physical interface (PHY) 330 suitable for use in APU 200 of FIG. 2 according to some embodiments. Memory controller 300 includes a memory channel 310 and a power engine 320. Memory channel 310 includes a host interface 312, a memory channel controller 314, and a physical interface 316. Host interface 312 bidirectionally connects memory channel controller 314 to data fabric 250 over a scalable data port (SDP). Physical interface 316 bidirectionally connects memory channel controller 314 to PHY 330 over a bus that conforms to the DDR-PHY Interface Specification (DFI). Power engine 320 is bidirectionally connected to SMU 280 over the SMN bus, to PHY 330 over the Advanced Peripheral Bus (APB), and also to memory channel controller 314. PHY 330 has a bidirectional connection to a memory channel such as memory channel 130 or memory channel 140 of FIG. 1. Memory controller 300 is an instantiation of a memory controller for a single memory channel using a single memory channel controller 314, and has a power engine 320 to control operation of memory channel controller 314 in a manner described further below.

FIG. 4 is a block diagram of another memory controller 400 and associated PHYs 440 and 450 suitable for use in APU 200 of FIG. 2 according to some embodiments. Memory controller 400 includes memory channels 410 and 420 and a power engine 430. Memory channel 410 includes a host interface 412, a memory channel controller 414, and a physical interface 416. Host interface 412 bidirectionally connects memory channel controller 414 to data fabric 250 over an SDP. Physical interface 416 conforms to the DFI specification and bidirectionally connects memory channel controller 414 to PHY 440. Memory channel 420 includes a host interface 422, a memory channel controller 424, and a physical interface 426. Host interface 422 bidirectionally connects memory channel controller 424 to data fabric 250 over another SDP. Physical interface 426 conforms to the DFI specification and bidirectionally connects memory channel controller 424 to PHY 450. Power engine 430 is bidirectionally connected to SMU 280 over the SMN bus, to PHYs 440 and 450 over the APB, and also to memory channel controllers 414 and 424. PHY 440 has a bidirectional connection to a memory channel such as memory channel 130 of FIG. 1. PHY 450 has a bidirectional connection to a memory channel such as memory channel 140 of FIG. 1. Memory controller 400 is an instantiation of a memory controller having two memory channel controllers, and uses a shared power engine 430 to control operation of both memory channel controller 414 and memory channel controller 424 in a manner described further below.

FIG. 5 is a block diagram of a memory controller 500 according to some embodiments. Memory controller 500 includes a memory channel controller 510 and a power controller 550. Memory channel controller 510 includes an interface 512, a queue 514, a command queue 520, an address generator 522, a content addressable memory (CAM) 524, a replay queue 530, a refresh logic block 532, a timing block 534, a page table 536, an arbiter 538, an error correction code (ECC) check block 542, an ECC generation block 544, and a data buffer (DB) 546.

Interface 512 has a first bidirectional connection to data fabric 250 over an external bus, and has an output. In memory controller 500, this external bus is compatible with the Advanced eXtensible Interface version four specified by ARM Holdings, PLC of Cambridge, England, known as "AXI4", but can be another type of interface in other embodiments. Interface 512 translates memory access requests from a first clock domain, known as the FCLK (or MEMCLK) domain, to a second clock domain internal to memory controller 500, known as the UCLK domain. Similarly, queue 514 provides memory accesses from the UCLK domain to the DFICLK domain associated with the DFI interface.

Address generator 522 decodes addresses of memory access requests received from data fabric 250 over the AXI4 bus. The memory access requests include access addresses in the physical address space represented as normalized addresses. Address generator 522 converts the normalized addresses into a format that can be used to address the actual memory devices in memory system 120, as well as to schedule the related accesses efficiently. This format includes a region identifier that associates the memory access request with a particular rank, a row address, a column address, a bank address, and a bank group. On startup, the system BIOS queries the memory devices in memory system 120 to determine their size and configuration, and programs a set of configuration registers associated with address generator 522. Address generator 522 uses the configuration stored in the configuration registers to translate the normalized addresses into the appropriate format. Command queue 520 is a queue of memory access requests received from the memory accessing agents in data processing system 100, such as CPU cores 212 and 214 and graphics core 220. Command queue 520 stores the address fields decoded by address generator 522 as well as other address information that allows arbiter 538 to select memory accesses efficiently, including access type and quality of service (QoS) identifiers. CAM 524 includes information to enforce ordering rules, such as write after write (WAW) and read after write (RAW) ordering rules.
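The translation performed by address generator 522 can be pictured as splitting a normalized address into bit fields. The sketch below is illustrative only: the field order and widths in `FIELDS` are hypothetical stand-ins for what the BIOS would program into the configuration registers, and real mappings are often interleaved or hashed rather than contiguous.

```python
# Hypothetical field layout, least significant field first: (name, width in bits).
FIELDS = [
    ("column", 10),
    ("bank", 2),
    ("bank_group", 2),
    ("row", 16),
    ("rank", 2),
]


def decode(normalized_address: int) -> dict:
    """Split a normalized address into rank, row, column, bank, and bank group."""
    decoded = {}
    for name, width in FIELDS:
        decoded[name] = normalized_address & ((1 << width) - 1)
        normalized_address >>= width
    return decoded


address = 5 | (1 << 10) | (2 << 12) | (0x1234 << 14) | (1 << 30)
print(decode(address))
```

Storing the decoded fields in the command queue, as the text describes, lets the arbiter compare bank and row values directly instead of re-decoding each address on every arbitration pass.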

Replay queue 530 is a temporary queue for storing memory accesses picked by arbiter 538 that are awaiting responses, such as address and command parity responses, write cyclic redundancy check (CRC) responses for DDR4 DRAM, or write and read CRC responses for GDDR5 DRAM. Replay queue 530 accesses ECC check block 542 to determine whether the returned ECC is correct or indicates an error. Replay queue 530 allows the accesses to be replayed in the case of a parity or CRC error in one of these cycles.
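The replay behavior can be sketched as a queue of issued commands that are either retired or handed back when their response arrives. This is a simplified model with hypothetical names, and it assumes that responses arrive in the order in which the commands were issued, which the text does not state.

```python
from collections import deque


class ReplayQueue:
    """Holds issued commands until their parity or CRC response arrives."""

    def __init__(self):
        self.pending = deque()

    def issue(self, command):
        """Record a command that has been sent and is awaiting its response."""
        self.pending.append(command)

    def on_response(self, ok: bool):
        """Retire the oldest pending command, or return it for replay on error."""
        command = self.pending.popleft()
        return None if ok else command


rq = ReplayQueue()
rq.issue("WR bank0 row5 col0")
rq.issue("WR bank0 row5 col8")
print(rq.on_response(ok=True))    # None: first write completed cleanly
print(rq.on_response(ok=False))   # second write is returned so it can be reissued
```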

Refresh logic 532 includes state machines for various power-down, refresh, and termination resistance (ZQ) calibration cycles that are generated separately from normal read and write memory access requests received from memory accessing agents. For example, if a memory rank is in precharge power-down, it must be periodically awakened to run refresh cycles. Refresh logic 532 periodically generates auto-refresh commands to prevent data errors caused by leakage of charge off the storage capacitors of the memory cells in the DRAM chips. In addition, refresh logic 532 periodically calibrates ZQ to prevent mismatch in on-die termination resistance due to thermal changes in the system. Refresh logic 532 also decides when to place DRAM devices in different power-down modes.
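The periodic generation of auto-refresh commands can be modeled as an interval counter. The sketch below is a minimal illustration; the `RefreshTimer` class and the interval value are hypothetical, and a real controller would take the interval from the DRAM's refresh timing parameters and would also handle postponed refreshes and power-down states.

```python
class RefreshTimer:
    """Signals that a refresh is due once every interval_cycles controller cycles."""

    def __init__(self, interval_cycles: int):
        self.interval = interval_cycles
        self.count = 0

    def tick(self) -> bool:
        """Advance one cycle and return True when an auto-refresh should be issued."""
        self.count += 1
        if self.count >= self.interval:
            self.count = 0
            return True
        return False


timer = RefreshTimer(interval_cycles=4)   # illustrative value only
print([timer.tick() for _ in range(8)])
```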

Arbiter 538 is bidirectionally connected to command queue 520 and is the heart of memory channel controller 510. It improves efficiency by intelligent scheduling of accesses to improve the usage of the memory bus. Arbiter 538 uses timing block 534 to enforce proper timing relationships by determining whether certain accesses in command queue 520 are eligible for issuance based on DRAM timing parameters. For example, each DRAM has a minimum specified time between activate commands to the same bank, known as "tRC". Timing block 534 is bidirectionally connected to replay queue 530 and maintains a set of counters that determine eligibility based on this and other timing parameters specified in the JEDEC specification. Page table 536 is bidirectionally connected to replay queue 530 and maintains state information about active pages in each bank and rank of the memory channel for arbiter 538.
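The eligibility check that timing block 534 performs for tRC can be sketched with a per-bank record of the last activate time. This is an illustrative model with hypothetical names and an arbitrary tRC value in controller cycles; a real timing block tracks many more parameters.

```python
class TimingBlock:
    """Tracks the last activate per bank and enforces a minimum spacing (tRC)."""

    def __init__(self, t_rc_cycles: int):
        self.t_rc = t_rc_cycles
        self.last_activate = {}   # bank -> cycle of the most recent activate

    def can_activate(self, bank: int, now: int) -> bool:
        """True if an activate to this bank is eligible at cycle 'now'."""
        last = self.last_activate.get(bank)
        return last is None or now - last >= self.t_rc

    def record_activate(self, bank: int, now: int) -> None:
        self.last_activate[bank] = now


timing = TimingBlock(t_rc_cycles=10)      # illustrative value only
timing.record_activate(bank=0, now=0)
print(timing.can_activate(bank=0, now=5))    # False: still inside tRC
print(timing.can_activate(bank=0, now=10))   # True: tRC has elapsed
print(timing.can_activate(bank=1, now=5))    # True: different bank
```

The arbiter would consult a check like this before picking a command, so that only requests whose timing constraints are satisfied compete in arbitration.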

In response to write memory access requests received from interface 512, ECC generation block 544 computes an ECC according to the write data. DB 546 stores the write data and ECC for received memory access requests. When arbiter 538 picks the corresponding write access for dispatch to the memory channel, DB 546 outputs the combined write data/ECC to queue 514.

Power controller 550 includes an interface 552 to an advanced extensible interface, version one (AXI), an APB interface 554, and a power engine 560. Interface 552 has a first bidirectional connection to the SMN, which includes an input for receiving an event signal labeled "Event_n" shown separately in FIG. 5, and an output. APB interface 554 has an input connected to the output of interface 552, and an output for connection to a PHY over an APB. Power engine 560 has an input connected to the output of interface 552, and an output connected to an input of queue 514. Power engine 560 includes a set of configuration registers 562, a microcontroller (μC) 564, a self-refresh controller (SLFREF/PE) 566, and a reliable read/write training engine (RRW/TE) 568. Configuration registers 562 are programmed over the AXI bus and store configuration information that controls the operation of various blocks in memory controller 500. Accordingly, configuration registers 562 have outputs connected to these blocks that are not shown in detail in FIG. 5. Self-refresh controller 566 is an engine that allows the manual generation of refreshes in addition to the automatic generation of refreshes by refresh logic 532. Reliable read/write training engine 568 provides a continuous memory access stream to memory or I/O devices for such purposes as DDR interface read latency training and loopback testing.

Memory channel controller 510 includes circuitry that allows it to pick memory accesses for dispatch to the associated memory channel. To make the desired arbitration decisions, address generator 522 decodes the address information into predecoded information that includes the rank, row address, column address, bank address, and bank group in the memory system, and command queue 520 stores the predecoded information. Configuration registers 562 store configuration information that determines how address generator 522 decodes the received address information. Arbiter 538 uses the decoded address information, the timing eligibility information indicated by timing block 534, and the active page information indicated by page table 536 to schedule memory accesses efficiently while observing other criteria such as QoS requirements. For example, arbiter 538 gives preference to accesses to open pages, which avoids the overhead of the precharge and activation commands required to change memory pages, and hides overhead accesses to one bank by interleaving them with read and write accesses to another bank. In particular, during normal operation arbiter 538 may decide to keep pages open in different banks until they are required to be precharged prior to selecting a different page.
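
By way of illustration only, the address predecoding described above can be sketched in Python as follows. The field order and bit widths in FIELDS, and the names DecodedAddress and decode, are hypothetical and are not taken from the disclosure; in the described controller the mapping is determined by configuration registers 562.

```python
from typing import NamedTuple, Sequence, Tuple


class DecodedAddress(NamedTuple):
    rank: int
    bank_group: int
    bank: int
    row: int
    column: int


# Hypothetical layout, lowest-order field first: (field name, width in bits).
# A real controller would take this mapping from its configuration registers.
FIELDS: Sequence[Tuple[str, int]] = (
    ("column", 10),
    ("bank_group", 2),
    ("bank", 2),
    ("row", 16),
    ("rank", 1),
)


def decode(addr: int, fields: Sequence[Tuple[str, int]] = FIELDS) -> DecodedAddress:
    """Split a flat address into predecoded fields, consuming low bits first."""
    parts = {}
    for name, width in fields:
        parts[name] = addr & ((1 << width) - 1)  # mask off this field
        addr >>= width                           # move on to the next field
    return DecodedAddress(**parts)
```

Passing a different fields sequence models a different configuration of the address map without changing the decoder itself.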

FIG. 6 is a block diagram of a portion 600 of memory controller 500 of FIG. 5 according to some embodiments. Portion 600 includes arbiter 538 and a set of control circuits 660 associated with the operation of arbiter 538. Arbiter 538 includes a set of sub-arbiters 605 and a final arbiter 650. Sub-arbiters 605 include a sub-arbiter 610, a sub-arbiter 620, and a sub-arbiter 630. Sub-arbiter 610 includes a page hit arbiter 612 labeled "PH ARB" and an output register 614. Page hit arbiter 612 has a first input and a second input connected to command queue 520, and an output. Register 614 has a data input connected to the output of page hit arbiter 612, a clock input for receiving the UCLK signal, and an output. Sub-arbiter 620 includes a page conflict arbiter 622 labeled "PC ARB" and an output register 624. Page conflict arbiter 622 has a first input and a second input connected to command queue 520, and an output. Register 624 has a data input connected to the output of page conflict arbiter 622, a clock input for receiving the UCLK signal, and an output. Sub-arbiter 630 includes a page miss arbiter 632 labeled "PM ARB" and an output register 634. Page miss arbiter 632 has a first input and a second input connected to command queue 520, and an output. Register 634 has a data input connected to the output of page miss arbiter 632, a clock input for receiving the UCLK signal, and an output. Final arbiter 650 has a first input connected to the output of refresh logic 532, a second input from page close predictor 662, a third input connected to the output of output register 614, a fourth input connected to the output of output register 624, a fifth input connected to the output of output register 634, a first output labeled "CMD1" for providing a first arbitration winner to queue 514, and a second output labeled "CMD2" for providing a second arbitration winner to queue 514.

Control circuits 660 include timing block 534 and page table 536, as previously described with respect to FIG. 5, and a page close predictor 662. Timing block 534 has an input, and an output connected to the first inputs of page hit arbiter 612, page conflict arbiter 622, and page miss arbiter 632. Page table 536 has an input connected to an output of replay queue 530, an output connected to an input of replay queue 530, an output connected to the input of command queue 520, an output connected to the input of timing block 534, and an output connected to the input of page close predictor 662. Page close predictor 662 has an input connected to one output of page table 536, an input connected to the output of output register 614, and an output connected to the second input of final arbiter 650.

In operation, arbiter 538 selects memory access requests (commands) from command queue 520 and refresh logic 532 by taking into account the page status of each entry, the priority of each memory access request, and the dependencies between requests. The priority is related to the quality of service (QoS) of requests received from the AXI4 bus and stored in command queue 520, but can be altered based on the type of memory access and the dynamic operation of arbiter 538. Arbiter 538 includes three sub-arbiters that operate in parallel to address the mismatch between the processing limits and the transmission limits of existing integrated circuit technology. The winners of the respective sub-arbitrations are presented to final arbiter 650. Final arbiter 650 selects among these three sub-arbitration winners as well as a refresh operation from refresh logic 532, and may further modify a read or write command into a read or write with auto-precharge command as determined by page close predictor 662.

Each of page hit arbiter 612, page conflict arbiter 622, and page miss arbiter 632 has an input connected to the output of timing block 534 to determine the timing eligibility of commands in command queue 520 that fall into its respective category. Timing block 534 includes an array of binary counters that count durations related to particular operations for each bank in each rank. The number of timers needed to determine the status depends on the timing parameter, the number of banks for the given memory type, and the number of ranks supported by the system on a given memory channel. The number of timing parameters that are implemented in turn depends on the type of memory implemented in the system. For example, GDDR5 memories require more timers to comply with more timing parameters than other DDRx memory types. By including an array of generic timers implemented as binary counters, timing block 534 can be scaled and reused for different memory types.
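
As a minimal software sketch of such an array of generic timers, and assuming down-counters that reach zero when a timing parameter has been satisfied, the behavior could be modeled in Python as shown below. The class name TimingBlock, its methods, and the cycle counts used with it are hypothetical illustrations rather than the disclosed circuit.

```python
from typing import Dict, List, Tuple


class TimingBlock:
    """One generic down-counter per (rank, bank, timing parameter)."""

    def __init__(self, ranks: int, banks: int, params: Tuple[str, ...]) -> None:
        # counters[rank][bank][param] holds the controller cycles still to wait.
        self.counters: List[List[Dict[str, int]]] = [
            [{p: 0 for p in params} for _ in range(banks)] for _ in range(ranks)
        ]

    def start(self, rank: int, bank: int, param: str, cycles: int) -> None:
        """Load a counter when the operation that starts the interval is issued."""
        self.counters[rank][bank][param] = cycles

    def tick(self) -> None:
        """Advance one controller clock: decrement every running counter."""
        for rank in self.counters:
            for bank in rank:
                for param, remaining in bank.items():
                    if remaining > 0:
                        bank[param] = remaining - 1

    def eligible(self, rank: int, bank: int, param: str) -> bool:
        """A command is timing-eligible once its counter has expired."""
        return self.counters[rank][bank][param] == 0
```

Because the timers are generic, supporting a memory type with more timing parameters only requires a longer params tuple, which mirrors the scaling argument made above.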

A page hit is a read or write cycle to an open page. Page hit arbiter 612 arbitrates between accesses in command queue 520 to open pages. The timing eligibility parameters tracked by timers in timing block 534 and checked by page hit arbiter 612 include, for example, the row address strobe (RAS) to column address strobe (CAS) delay time (tRCD) and the CAS latency (tCL). For example, tRCD specifies the minimum amount of time that must elapse after a page has been opened in a RAS cycle before a read or write access to that page. Page hit arbiter 612 selects a sub-arbitration winner based on the assigned priority of the accesses. In one embodiment, the priority is a 4-bit, one-hot value that indicates a priority among four values, although this four-level priority scheme is just one example. If page hit arbiter 612 detects two or more requests at the same priority level, then the oldest entry wins.
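
The selection rule of a sub-arbiter (highest priority first, oldest entry on a tie) can be sketched as follows. The Entry fields, the assumption that a higher one-hot bit means a higher priority, and the function name pick_winner are hypothetical and serve only to make the rule concrete.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Entry:
    tag: str
    priority: int   # one-hot: 0b0001 (lowest) to 0b1000 (highest), assumed encoding
    age: int        # cycles spent in the command queue; larger means older
    eligible: bool  # timing eligibility as reported by the timing block


def pick_winner(entries: Sequence[Entry]) -> Optional[Entry]:
    """Return the eligible entry with the highest priority, oldest on a tie."""
    candidates = [e for e in entries if e.eligible]
    if not candidates:
        return None
    # One-hot values compare correctly as integers; age breaks priority ties.
    return max(candidates, key=lambda e: (e.priority, e.age))
```

The same function could serve the page conflict and page miss sub-arbiters, since the text states the same priority and age rule for each of them.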

A page conflict is an access to one row in a bank when another row in the bank is currently activated. Page conflict arbiter 622 arbitrates between accesses in command queue 520 to pages that conflict with the page that is currently open in the corresponding bank and rank. Page conflict arbiter 622 selects a sub-arbitration winner that causes the issuance of a precharge command. The timing eligibility parameters tracked by timers in timing block 534 and checked by page conflict arbiter 622 include, for example, the active to precharge command period (tRAS). Page conflict arbiter 622 selects a sub-arbitration winner based on the assigned priority of the accesses. If page conflict arbiter 622 detects two or more requests at the same priority level, then the oldest entry wins.

A page miss is an access to a bank that is in the precharged state. Page miss arbiter 632 arbitrates between accesses in command queue 520 to precharged memory banks. The timing eligibility parameters tracked by timers in timing block 534 and checked by page miss arbiter 632 include, for example, the precharge command period (tRP). If there are two or more requests that are page misses at the same priority level, then the oldest entry wins.
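
The three access categories defined above can be summarized in a short sketch. The page table is modeled here as a dictionary from (rank, bank) to the currently open row, with None or a missing key standing for a precharged bank; this representation and the name classify are assumptions made for illustration.

```python
from typing import Dict, Optional, Tuple

# Maps (rank, bank) to the open row; None or a missing key means precharged.
PageTable = Dict[Tuple[int, int], Optional[int]]


def classify(page_table: PageTable, rank: int, bank: int, row: int) -> str:
    """Classify an access as a page 'hit', 'conflict', or 'miss'."""
    open_row = page_table.get((rank, bank))
    if open_row is None:
        return "miss"      # bank is precharged: an activate is needed first
    if open_row == row:
        return "hit"       # target row is already open
    return "conflict"      # another row is open: precharge, then activate
```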

Each sub-arbiter outputs a priority value for its respective sub-arbitration winner. Final arbiter 650 compares the priority values of the sub-arbitration winners from each of page hit arbiter 612, page conflict arbiter 622, and page miss arbiter 632. Final arbiter 650 determines the relative priority among the sub-arbitration winners by performing a set of relative priority comparisons that consider two sub-arbitration winners at a time.

After determining the relative priority among the three sub-arbitration winners, final arbiter 650 determines whether the sub-arbitration winners conflict (i.e., whether they are directed to the same bank and rank). When there are no such conflicts, final arbiter 650 selects up to two sub-arbitration winners with the highest priorities. When there are conflicts, final arbiter 650 complies with the following rules. When the priority value of the sub-arbitration winner of page hit arbiter 612 is higher than that of page conflict arbiter 622, and both are directed to the same bank and rank, final arbiter 650 selects the access indicated by page hit arbiter 612. When the priority value of the sub-arbitration winner of page conflict arbiter 622 is higher than that of page hit arbiter 612, and both are directed to the same bank and rank, final arbiter 650 selects the winner based on several additional factors. In some cases, page close predictor 662 causes the page to close at the end of the access indicated by page hit arbiter 612 by setting the auto-precharge attribute.
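
A simplified sketch of these final-arbitration rules is given below. It omits the refresh input, and because the text does not enumerate the "additional factors", the hook prefer_conflict is only a placeholder assumption, as are the Winner fields and the function names.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Winner:
    kind: str      # "hit", "conflict", or "miss"
    priority: int
    rank: int
    bank: int


def prefer_conflict(hit: Winner, conflict: Winner) -> Winner:
    """Placeholder for the unspecified 'additional factors' (assumption)."""
    return conflict


def final_arbitrate(
    hit: Optional[Winner],
    conflict: Optional[Winner],
    miss: Optional[Winner],
    max_cmds: int = 2,
    resolve: Callable[[Winner, Winner], Winner] = prefer_conflict,
) -> List[Winner]:
    """Pick up to max_cmds winners, resolving a same-bank hit/conflict clash."""
    if (hit is not None and conflict is not None
            and (hit.rank, hit.bank) == (conflict.rank, conflict.bank)):
        if hit.priority > conflict.priority:
            conflict = None          # rule stated in the text: the page hit wins
        elif resolve(hit, conflict) is hit:
            conflict = None
        else:
            hit = None
    candidates = [w for w in (hit, conflict, miss) if w is not None]
    # The sort is stable, so equal priorities keep hit, conflict, miss order.
    candidates.sort(key=lambda w: w.priority, reverse=True)
    return candidates[:max_cmds]
```

Calling the function with max_cmds=1 models a mode in which only a single winner is selected per controller cycle.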

Within page hit arbiter 612, the priority is initially set by the request priority from the memory access agent, but is adjusted dynamically based on the type of access (read or write) and the sequence of accesses. In general, page hit arbiter 612 assigns a higher implicit priority to reads, but implements a priority elevation mechanism to ensure that writes make progress toward completion.
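
The text does not specify how the elevation mechanism works. Purely as one possible illustration, an aging scheme could raise the priority of a write the longer it waits, as in the following sketch; the function name and the constants read_bonus and bump_every are hypothetical.

```python
def effective_priority(base: int, is_read: bool, wait_cycles: int,
                       read_bonus: int = 1, bump_every: int = 64) -> int:
    """Return a scheduling priority; every constant here is an assumption.

    Reads receive a fixed implicit bonus. Writes gain one priority step for
    each bump_every cycles spent waiting, so they cannot be starved forever.
    """
    bonus = read_bonus if is_read else 0
    aging = 0 if is_read else wait_cycles // bump_every
    return base + bonus + aging
```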

When page hit arbiter 612 selects a read or write command, page close predictor 662 determines whether to send the command with the auto-precharge (AP) attribute. During a read or write cycle, the auto-precharge attribute is set with a predefined address bit, and it causes the DDR device to close the page after the read or write cycle is complete, which avoids the need for the memory controller to later send a separate precharge command for that bank. Page close predictor 662 takes into account other requests already present in command queue 520 that access the same bank as the selected command. If page close predictor 662 converts a memory access into an AP command, the next access to that page will be a page miss.
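
One simple policy consistent with this description, though not necessarily the one used by page close predictor 662, is to set auto-precharge only when no other queued request targets the same open page. The sketch below assumes requests are represented as (rank, bank, row) tuples; that representation and the name use_auto_precharge are hypothetical.

```python
from typing import Iterable, Tuple

Request = Tuple[int, int, int]  # (rank, bank, row)


def use_auto_precharge(selected: Request, pending: Iterable[Request]) -> bool:
    """Close the page unless another pending request would hit the same row.

    pending lists the other requests in the command queue and does not
    include the selected request itself.
    """
    rank, bank, row = selected
    return not any(
        r == rank and b == bank and w == row for r, b, w in pending
    )
```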

Arbiter 538 supports issuing either one command or two commands per memory controller clock cycle. For example, DDR4 3200 is a speed bin of DDR4 DRAM that operates at a memory clock frequency of 1600 MHz. If the integrated circuit process technology allows memory controller 500 to operate at 1600 MHz, then memory controller 500 can issue one memory access every memory controller clock cycle. In this case, final arbiter 650 can operate in a 1X mode in which it selects only a single arbitration winner every memory controller clock cycle.

However, for higher-speed memories such as DDR4 3600 or LPDDR4 4667, a memory controller clock speed of 1600 MHz may be too slow to use the full bandwidth of the memory bus. To accommodate these higher-performance DRAMs, arbiter 538 supports a 2X mode in which final arbiter 650 selects two commands (CMD1 and CMD2) every memory controller clock cycle. Arbiter 538 provides this mode so that each sub-arbiter can operate in parallel using the slower memory controller clock. As shown in FIG. 6, arbiter 538 includes three sub-arbiters, and in 2X mode final arbiter 650 selects two arbitration winners as the best two of the three.

Note that in 2X mode, memory controller 500 can operate at a memory controller clock speed that is slower than its highest speed in order to align memory controller command generation with the memory clock cycle. In the example of DDR4 3600, in which the memory controller is capable of operating at clock speeds of up to 1600 MHz, the clock speed can be reduced to 900 MHz in 2X mode.
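
The clock arithmetic behind the 1X and 2X modes can be checked with a short calculation. The sketch assumes the usual double data rate relationship of two transfers per memory clock cycle; the function names are hypothetical.

```python
def memory_clock_mhz(data_rate_mts: float) -> float:
    """Double data rate: two transfers per memory clock cycle."""
    return data_rate_mts / 2


def controller_clock_mhz(data_rate_mts: float, commands_per_cycle: int) -> float:
    """Controller clock needed to issue one command per memory clock cycle."""
    return memory_clock_mhz(data_rate_mts) / commands_per_cycle


def pick_mode(data_rate_mts: float, max_controller_mhz: float) -> int:
    """Return 1 (1X mode) or 2 (2X mode), whichever first fits the limit."""
    for commands_per_cycle in (1, 2):
        needed = controller_clock_mhz(data_rate_mts, commands_per_cycle)
        if needed <= max_controller_mhz:
            return commands_per_cycle
    raise ValueError("data rate too high for the modes modeled here")
```

With a 1600 MHz controller limit, DDR4 3200 fits in 1X mode at 1600 MHz, whereas DDR4 3600 needs an 1800 MHz memory clock and therefore runs in 2X mode with a 900 MHz controller clock, which matches the figures given above.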

By using different sub-arbiters for different memory access types, each arbiter can be implemented with simpler logic than if it were required to arbitrate between all access types (page hits, page misses, and page conflicts). Thus the arbitration logic can be simplified and the size of arbiter 538 can be kept relatively small. By using sub-arbiters for page hits, page conflicts, and page misses, arbiter 538 allows the picking of two commands that pair well with each other to hide the latency of accesses with data transfers.

In other embodiments, arbiter 538 could include a different number of sub-arbiters as long as it has at least two sub-arbiters to support the 2X mode. For example, arbiter 538 could include four sub-arbiters, which would allow up to four accesses to be picked per memory controller clock cycle. In yet other embodiments, arbiter 538 could include two or more sub-arbiters of any single type. For example, arbiter 538 could include two or more page hit arbiters, two or more page conflict arbiters, and/or two or more page miss arbiters. In this case, arbiter 538 could select two or more accesses of the same type each controller cycle.

The circuits of FIGS. 5 and 6 may be implemented with various combinations of hardware and software. For example, the hardware circuitry may include priority encoders, finite state machines, programmable logic arrays (PLAs), and the like, and arbiter 538 could be implemented with a microcontroller executing stored program instructions to evaluate the relative timing eligibility of the pending commands. In this case some of the instructions may be stored in a non-transitory computer memory or computer readable storage medium for execution by the microcontroller. In various embodiments, the non-transitory computer readable storage medium includes a magnetic or optical disk storage device, solid-state storage devices such as Flash memory, or other non-volatile memory devices. The computer readable instructions stored on the non-transitory computer readable storage medium may be in source code, assembly language code, object code, or another instruction format that is interpreted and/or executable by one or more processors.

APU 110 of FIG. 1, memory controller 500 of FIG. 5, or any portions thereof, such as arbiter 538, may be described or represented by a computer accessible data structure in the form of a database or other data structure that can be read by a program and used, directly or indirectly, to fabricate integrated circuits. For example, this data structure may be a behavioral-level description or a register-transfer level (RTL) description of the hardware functionality in a high level design language (HDL) such as Verilog or VHDL. The description may be read by a synthesis tool that synthesizes the description to produce a netlist comprising a list of gates from a synthesis library. The netlist includes a set of gates that also represent the functionality of the hardware comprising the integrated circuits. The netlist may then be placed and routed to produce a data set describing geometric shapes to be applied to masks. The masks may then be used in various semiconductor fabrication steps to produce the integrated circuits. Alternatively, the database on the computer accessible storage medium may be the netlist (with or without the synthesis library) or the data set, as desired, or Graphic Data System (GDS) II data.

While particular embodiments have been described, various modifications to these embodiments will be apparent to those skilled in the art. For example, the internal architecture of memory channel controller 510 and/or power engine 550 may vary in different embodiments. Memory controller 500 may interface to types of memory other than DDRx memory, such as high bandwidth memory (HBM), RAMbus DRAM (RDRAM), and the like. While the illustrated embodiment showed each rank of memory corresponding to a separate DIMM, in other embodiments each DIMM can support multiple ranks.

Accordingly, it is intended by the appended claims to cover all modifications of the disclosed embodiments that fall within the scope of the disclosed embodiments.

In one form, a memory controller disclosed herein includes a command queue and an arbiter that includes a plurality of sub-arbiters. According to one aspect, the arbiter includes first, second, and third sub-arbiters for providing first, second, and third sub-arbitration winners, respectively, and a final arbiter for selecting two final arbitration winners, wherein the final arbiter selects the two final arbitration winners from among the first, second, and third sub-arbitration winners and an overhead command. In this case, the overhead command may include one of a power-down command, an auto-refresh command, and a calibration command.

In another form, the memory controller disclosed herein is part of a data processing system that includes a memory access agent, a memory system, and the memory controller connected to the memory access agent and the memory system.

In yet another form, a method includes receiving a plurality of memory access requests, storing the plurality of memory access requests in a command queue, and selecting memory access requests from the command queue, wherein the selecting includes selecting a plurality of sub-arbitration winners from among the memory access requests during a first controller cycle, and selecting among the plurality of sub-arbitration winners to provide a plurality of memory commands in a corresponding plurality of memory command cycles. According to one aspect, the method further includes selecting among the plurality of sub-arbitration winners and an overhead command to provide a second plurality of memory commands in a corresponding second plurality of memory cycles, and providing the overhead command as one of a power-down command, an auto-refresh command, and a calibration command. According to another aspect, selecting among the plurality of sub-arbitration winners to provide the plurality of memory commands includes selecting among the plurality of sub-arbitration winners to provide the plurality of memory commands in the corresponding plurality of memory command cycles, wherein a memory command cycle is shorter than a controller cycle. According to yet another aspect, selecting the plurality of sub-arbitration winners from among the memory access requests during the first controller cycle includes selecting a first plurality of sub-arbitration winners of a same type from among the memory access requests during the first controller cycle, and the method further includes selecting two final arbitration winners of the same type during the first controller cycle.

Claims

A memory controller (500) comprising:
a command queue (520) for receiving and storing memory access requests; and
an arbiter (538) comprising a plurality of sub-arbiters (605) for providing a corresponding plurality of sub-arbitration winners from among the memory access requests during a controller cycle, and for selecting among the plurality of sub-arbitration winners to provide a plurality of memory commands in a corresponding controller cycle.

The memory controller (500) of claim 1, wherein a memory command cycle is shorter than the corresponding controller cycle.

The memory controller (500) of claim 2, wherein:
the controller cycle is defined by a controller clock signal;
the memory command cycle is defined by a memory clock signal; and
the memory clock signal has a higher frequency than the controller clock signal.

The memory controller (500) of claim 3, wherein the frequency of the memory clock signal is twice the frequency of the controller clock signal.

The memory controller (500) of claim 1, wherein the plurality of sub-arbiters (605) comprises:
a first sub-arbiter (610) connected to the command queue (520) for determining a first sub-arbitration winner from among active entries in the command queue (520) in synchronization with a controller clock signal; and
a second sub-arbiter (620) connected to the command queue (520) for determining a second sub-arbitration winner, different from the first sub-arbitration winner, from among the active entries in the command queue (520) in synchronization with the controller clock signal,
wherein the memory controller (500) is operative to output the first sub-arbitration winner as a first memory command in a first cycle of a memory clock signal and to output the second sub-arbitration winner as a second memory command in a subsequent cycle of the memory clock signal, the frequency of the memory clock signal being higher than the frequency of the controller clock signal.

The memory controller (500) of claim 5, wherein the plurality of sub-arbiters (605) further comprises a third sub-arbiter (630) connected to the command queue (520) for determining a third sub-arbitration winner from among active entries in the command queue (520) in synchronization with the controller clock signal.

The memory controller (500) of claim 6, wherein the arbiter (538) further comprises a final arbiter (650) for selecting two final arbitration winners from among the first sub-arbitration winner, the second sub-arbitration winner, and the third sub-arbitration winner, and for providing the two final arbitration winners as the first memory command and the second memory command.

The memory controller (500) of claim 7, wherein the final arbiter (650) selects the two final arbitration winners from among the first sub-arbitration winner, the second sub-arbitration winner, the third sub-arbitration winner, and an overhead command.

The memory controller (500) of claim 8, wherein the overhead command comprises one of a power down command, an auto refresh command, and a calibration command.

The memory controller (500) of claim 7, wherein:
the plurality of sub-arbiters (605) includes at least one additional sub-arbiter of the same type as one of the first sub-arbiter (610), the second sub-arbiter (620), and the third sub-arbiter (630); and
the final arbiter (650) selects two final arbitration winners of the same type from the plurality of sub-arbiters (605) in the corresponding controller cycle.

The memory controller (500) of claim 6, wherein:
the first sub-arbiter (610) selects the first sub-arbitration winner from among page hit commands in the command queue (520);
the second sub-arbiter (620) selects the second sub-arbitration winner from among page conflict commands in the command queue (520); and
the third sub-arbiter (630) selects the third sub-arbitration winner from among page miss commands in the command queue (520).

The memory controller (500) of claim 1, wherein:
each of the plurality of sub-arbiters (605) selects an arbitration winner from among commands of an associated type in the command queue (520);
at least two of the plurality of sub-arbiters (605) select arbitration winners of the same type; and
the arbiter (538) selects two final arbitration winners of the same type from the plurality of sub-arbiters (605) in the corresponding controller cycle.

A data processing system (100) comprising:
a memory access agent (110, 210, 220) for providing memory access requests;
a memory system (120); and
a memory controller (292, 500) connected to the memory access agent (110, 210, 220) and the memory system (120), the memory controller (292, 500) comprising:
a command queue (520) for storing memory access commands received from the memory access agent (110, 210, 220); and
an arbiter (538) comprising a plurality of sub-arbiters (605) for providing a corresponding plurality of sub-arbitration winners from among the memory access requests during a controller cycle, and for selecting from among the plurality of sub-arbitration winners to provide a plurality of memory commands in a corresponding controller cycle.

The data processing system (100) of claim 13, wherein the memory access agent comprises:
a central processing unit core (212, 214);
a graphics processing unit core (220); and
a data fabric (250) interconnecting the central processing unit core (212, 214) and the graphics processing unit core (220) to the memory controller (292, 500).

The data processing system (100) of claim 13, wherein a memory command cycle is shorter than the controller cycle.

The data processing system (100) of claim 15, wherein:
the controller cycle is defined by a controller clock signal;
the memory command cycle is defined by a memory clock signal; and
the memory clock signal has a higher frequency than the controller clock signal.

The data processing system (100) of claim 16, wherein the frequency of the memory clock signal is twice the frequency of the controller clock signal.

The data processing system (100) of claim 13, wherein the plurality of sub-arbiters (605) comprises:
a first sub-arbiter (610) connected to the command queue (520) for determining a first sub-arbitration winner from among active entries in the command queue (520) in synchronization with a controller clock signal; and
a second sub-arbiter (620) connected to the command queue (520) for determining a second sub-arbitration winner, different from the first sub-arbitration winner, from among the active entries in the command queue (520) in synchronization with the controller clock signal,
wherein the memory controller (500) is operative to output the first sub-arbitration winner as a first memory command in a first cycle of a memory clock signal and to output the second sub-arbitration winner as a second memory command in a subsequent cycle of the memory clock signal, the frequency of the memory clock signal being higher than the frequency of the controller clock signal.

The data processing system (100) of claim 18, wherein the plurality of sub-arbiters (605) further comprises a third sub-arbiter (630) connected to the command queue (520) for determining a third sub-arbitration winner from among active entries in the command queue (520) in synchronization with the controller clock signal.

The data processing system (100) of claim 19, wherein the arbiter (538) further comprises a final arbiter (650) for selecting two final arbitration winners from among the first sub-arbitration winner, the second sub-arbitration winner, and the third sub-arbitration winner, and for providing the two final arbitration winners as the first memory command and the second memory command.

The data processing system (100) of claim 20, wherein:
the plurality of sub-arbiters (605) includes at least one additional sub-arbiter of the same type as one of the first sub-arbiter (610), the second sub-arbiter (620), and the third sub-arbiter (630); and
the final arbiter (650) selects two final arbitration winners of the same type from the plurality of sub-arbiters (605) in the corresponding controller cycle.

The data processing system (100) of claim 19, wherein:
the first sub-arbiter (610) selects the first sub-arbitration winner from among page hit commands in the command queue (520);
the second sub-arbiter (620) selects the second sub-arbitration winner from among page conflict commands in the command queue (520); and
the third sub-arbiter (630) selects the third sub-arbitration winner from among page miss commands in the command queue (520).

The data processing system (100) of claim 13, wherein:
each of the plurality of sub-arbiters (605) selects an arbitration winner from among commands of an associated type in the command queue (520);
at least two of the plurality of sub-arbiters (605) select arbitration winners of the same type; and
the arbiter (538) selects two final arbitration winners of the same type from the plurality of sub-arbiters (605) in the corresponding controller cycle.

A method comprising:
receiving a plurality of memory access requests;
storing the plurality of memory access requests in a command queue (520); and
selecting memory access requests from the command queue (520), the selecting comprising:
selecting a plurality of sub-arbitration winners from among the memory access requests during a first controller cycle; and
selecting from among the plurality of sub-arbitration winners to provide a plurality of memory commands in a corresponding controller cycle.

The method of claim 24, wherein selecting the plurality of sub-arbitration winners comprises:
selecting a first sub-arbitration winner from among page hit commands in the command queue (520);
selecting a second sub-arbitration winner from among page conflict commands in the command queue (520); and
selecting a third sub-arbitration winner from among page miss commands in the command queue (520).

The method of claim 25, further comprising:
selecting a fourth sub-arbitration winner from among one of the page hit commands, the page conflict commands, and the page miss commands in the command queue; and
selecting two final arbitration winners of the same type from among the first sub-arbitration winner, the second sub-arbitration winner, the third sub-arbitration winner, and the fourth sub-arbitration winner in the first controller cycle.

The method of claim 24, further comprising selecting among the plurality of sub-arbitration winners and an overhead command to provide a second plurality of memory commands in a corresponding second plurality of memory cycles.

The method of claim 27, further comprising providing the overhead command as one of a power down command, an auto refresh command, and a calibration command.

The method of claim 24, wherein selecting from among the plurality of sub-arbitration winners to provide the plurality of memory commands comprises selecting from among the plurality of sub-arbitration winners to provide the plurality of memory commands in corresponding memory command cycles, wherein a memory command cycle is shorter than the controller cycle.

The method of claim 24, wherein selecting the plurality of sub-arbitration winners from among the memory access requests during the first controller cycle comprises selecting a first plurality of sub-arbitration winners of the same type from among the memory access requests during the first controller cycle, the method further comprising selecting two final arbitration winners of the same type during the first controller cycle.
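
The claims sort queued commands into page hit, page conflict, and page miss types without defining them here. The short Python sketch below follows common DRAM terminology: a page hit targets the row already open in a bank, a page miss targets a bank with no open row, and a page conflict targets a bank in which a different row is open. The function name `classify` and the `open_rows` mapping are hypothetical and serve only to show how a request might be routed to the matching sub-arbiter.

```python
from typing import Dict, Optional


def classify(bank: int, row: int, open_rows: Dict[int, Optional[int]]) -> str:
    """Classify an access by comparing its row with the bank's open row.

    `open_rows` maps a bank number to its currently open row, or to None
    when the bank is precharged. A bank missing from the mapping is
    treated as precharged (an assumption of this sketch).
    """
    open_row = open_rows.get(bank)
    if open_row is None:
        return "page_miss"      # no row open: activate, then access
    if open_row == row:
        return "page_hit"       # the wanted row is already open
    return "page_conflict"      # another row is open: precharge first


if __name__ == "__main__":
    state = {0: 5, 1: None}
    for bank, row in [(0, 5), (0, 7), (1, 3)]:
        print(bank, row, classify(bank, row, state))
```

Tagging each queue entry this way lets every sub-arbiter consider only entries of its own type.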