[B! CPU] mooz's bookmarks

Memories of the 8500

mooz 2017/05/14

Recollections of developing Hitachi's HITAC 8000 mainframe. Fascinating!

computer
cpu

d: the parameter describing the direction; it can hold the values left and right. Integer number: here, integer numbers are values which are mostly unsigned. Numbers might be encoded in other bases such as 10, as was done on ancient computers. Today only base 2 (i.e. bits) is used, and this is what is crucially needed for all bit operations. If signed numbers are present, it is assumed that they

mooz 2015/05/20

sheep and goat.

cpu
binary
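
The "sheep and goat" note above presumably refers to the sheep-and-goats (SAG) bit operation, which separates the bits of a word according to a mask. Below is a minimal loop-based sketch in C, assuming 32-bit words and the convention in which the selected bits ("sheep") move to the high end and the rest ("goats") to the low end. The function names are hypothetical helpers, not taken from the bookmarked page, and the loops favor clarity over speed.

```c
#include <stdint.h>

/* Pack the bits of x selected by mask into the low end of the result,
   preserving their relative order (a software analogue of BMI2 PEXT). */
uint32_t compress_right(uint32_t x, uint32_t mask)
{
    uint32_t result = 0;
    int out = 0;                       /* next free position in result */
    for (int i = 0; i < 32; i++) {
        if ((mask >> i) & 1u) {
            result |= ((x >> i) & 1u) << out;
            out++;
        }
    }
    return result;
}

/* Count set bits by clearing the lowest set bit until none remain. */
int popcount32(uint32_t v)
{
    int n = 0;
    while (v) {
        v &= v - 1u;
        n++;
    }
    return n;
}

/* Sheep-and-goats: bits of x where mask is 1 go to the high end,
   bits where mask is 0 go to the low end; order is kept within each group. */
uint32_t sheep_and_goats(uint32_t x, uint32_t mask)
{
    int goats = popcount32(~mask);             /* low positions used by goats */
    uint32_t sheep_bits = compress_right(x, mask);
    uint32_t goat_bits = compress_right(x, ~mask);
    /* Shifting a 32-bit value by 32 is undefined, so handle mask == 0. */
    uint32_t high = (goats == 32) ? 0u : (sheep_bits << goats);
    return high | goat_bits;
}
```

For example, with x = 0xB6 and mask = 0x0F, the four low bits (0x6) move to the top of the word and the remaining bits (0xB) are packed at the bottom.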

x86 Bit manipulation instruction set - Wikipedia

Bit manipulation instruction sets (BMI sets) are extensions to the x86 instruction set architecture for microprocessors from Intel and AMD. The purpose of these instruction sets is to improve the speed of bit manipulation. All the instructions in these sets are non-SIMD and operate only on general-purpose registers. There are two sets published by Intel: BMI (now referred to as BMI1) and BMI2; th

mooz 2015/05/20

cpu
binary
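
As a rough illustration of what several BMI1 instructions compute, here are portable C equivalents. This is only a sketch: the function names are hypothetical and simply mirror the instruction mnemonics, and whether a compiler actually emits the BMI1 instruction for these expressions depends on the compiler and target flags.

```c
#include <stdint.h>

/* ANDN: bitwise AND of the inverted first operand with the second. */
uint32_t andn(uint32_t a, uint32_t b)
{
    return ~a & b;
}

/* BLSI: isolate the lowest set bit (0 if x is 0). */
uint32_t blsi(uint32_t x)
{
    return x & (0u - x);
}

/* BLSR: clear the lowest set bit. */
uint32_t blsr(uint32_t x)
{
    return x & (x - 1u);
}

/* BLSMSK: mask of all bits up to and including the lowest set bit. */
uint32_t blsmsk(uint32_t x)
{
    return x ^ (x - 1u);
}
```

With x = 0x58 (binary 0101 1000), these give blsi = 0x08, blsr = 0x50 and blsmsk = 0x0F.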

BID Data Project | Big Data Analytics with Small Footprint

Welcome to the BID Data Project! Here you will find resources for the fastest Big Data tools on the Web. See our Benchmarks on github. BIDMach running on a single GPU-equipped host holds the records for many common machine learning problems, on single nodes or clusters. Try It! BIDMach is an interactive environment designed to make it extremely easy to build and use machine learning models. BIDMac

mooz 2014/10/07

Articles by Hisa Ando (page 1) | Mynavi News

Computer Architecture Talk, No. 481: China's Sunway TaihuLight supercomputer, which took first place on the Top500 with a homegrown many-core chip

mooz 2014/04/17

Why is a CPU branch instruction slow?

mooz 2014/02/19

branch

Software Prefetches Performance Counter

Software Prefetches PAPI_PRF_SW Retired software prefetches. x86 and x86_64 There are various kinds of software prefetches on x86/x86_64. Four of them came with SSE1 (note this is implementation specific; on AMD prefetcht0/t1/t2 all do the same thing, and on intel P4 tends to treat them differently than others): PREFETCHNTA - non temporal, meaning you plan to use it once and never again PREFETCHT0

mooz 2013/07/10

Prefetch hints: NTA, T0 to T2.

x86
cpu
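
To make the prefetch hints concrete, here is a minimal sketch in C using the GCC/Clang builtin `__builtin_prefetch(addr, rw, locality)`. It assumes a GCC-compatible compiler; on x86 a locality of 0 is typically lowered to PREFETCHNTA and a locality of 3 to PREFETCHT0. The function name and `PREFETCH_DISTANCE` are hypothetical, and the distance would need tuning by measurement on real hardware.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical look-ahead, in elements; tune by measurement. */
#define PREFETCH_DISTANCE 16

/* Sum an array while hinting that upcoming elements will be read once.
   The prefetch is only a hint: the result is the same without it. */
long sum_with_prefetch(const int *data, size_t n)
{
    assert(data != NULL || n == 0);
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        if (i + PREFETCH_DISTANCE < n) {
            /* rw = 0: prefetch for reading; locality = 0: no temporal reuse. */
            __builtin_prefetch(&data[i + PREFETCH_DISTANCE], 0, 0);
        }
        total += data[i];
    }
    return total;
}
```

A sequential scan like this is usually handled well by the hardware prefetcher already, so an explicit hint tends to matter more for irregular access patterns.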

Xeon Phi - Wikipedia

Xeon Phi[3] is a discontinued series of x86 manycore processors designed and made by Intel. It was intended for use in supercomputers, servers, and high-end workstations. Its architecture allowed use of standard programming languages and application programming interfaces (APIs) such as OpenMP.[4][5] Xeon Phi launched in 2010. Since it was originally based on an earlier GPU design (codenamed "Larr

mooz 2013/06/18

Intel Many Integrated Core Architecture (MIC)

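Size differences between 32-bit and 64-bit (C language): のぼメモ(仮)

Sizes differ between 32-bit and 64-bit environments, so here is a memo. The results of applying sizeof() to various types are as follows. Care is needed for portability when using types whose size changes between OSes or between 32-bit and 64-bit.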

mooz 2012/10/16

sizeof(long) is 4 bytes on 64-bit Windows.
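
A small C sketch of the point above: `long` is 4 bytes under the LLP64 data model used by 64-bit Windows and 8 bytes under the LP64 model used by 64-bit Linux, so fields that must have a fixed width are better declared with the `<stdint.h>` types. The names `print_type_sizes` and `struct record` are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Fixed-width types have the same size under every data model. */
static_assert(sizeof(int32_t) == 4, "int32_t is always 4 bytes");
static_assert(sizeof(int64_t) == 8, "int64_t is always 8 bytes");

/* A record whose layout should not depend on the width of long. */
struct record {
    int64_t offset;   /* 8 bytes on both LP64 and LLP64 */
    int32_t count;    /* 4 bytes on both */
};

/* Print the sizes that commonly differ between platforms. */
void print_type_sizes(void)
{
    printf("int:       %zu\n", sizeof(int));
    printf("long:      %zu\n", sizeof(long));       /* 4 on LLP64, 8 on LP64 */
    printf("long long: %zu\n", sizeof(long long));
    printf("void *:    %zu\n", sizeof(void *));
    printf("size_t:    %zu\n", sizeof(size_t));
}
```

Calling `print_type_sizes()` on each target platform shows which types change size there.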

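March 25, 2011 issue: Mixed multi-architecture environments, cloud-based VDI, Upstart 1.2 | gihyo.jp

Ubuntu Weekly Topics, March 25, 2011 issue: Mixed multi-architecture environments, cloud-based VDI, Upstart 1.2. Natty development: With about a month to go until the release of 11.04 (Natty), Natty development is accelerating. New features such as Unity have been changed substantially, and in terms of functionality it has reached a state that can be called "nearly usable." Crash bugs and odd behavior typical of a development release remain, but with reasonable preparation it is ready to be installed on real hardware for testing (that said, it is still far too risky for so-called production use; you should have a solid backup regime and spare machines on hand, since it is common for a particular update to leave the system unbootable). Note that, following the ABI change to X in Natty, the latest X packages temporarily, because of dependencies, update/install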

mooz 2012/08/23

linux
cpu

Non-uniform memory access - Wikipedia

Non-uniform memory access (NUMA) is a computer memory design used in multiprocessing, where the memory access time depends on the memory location relative to the processor. Under NUMA, a processor can access its own local memory faster than non-local memory (memory lo

mooz 2012/07/14

ccNUMA: cache-coherent NUMA

memory
CPU

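An explanation of CPU performance

= What is a CPU? = CPU is short for "Central Processing Unit." It sits at the center of a PC and carries out the processing and calculation for the whole machine; it is truly the brain. It is therefore no exaggeration to say that the quality of this part directly determines the PC's performance. It is that important. The better the CPU, the faster and more stably the computer can handle complex and numerous tasks. A CPU looks like a flat tile, like this. The one in the image on the left is black and green, but there are many colors depending on the type, and white ones have become common recently. The back of a CPU has many spiky pins. The motherboard (circuit board) that the CPU fits into has many small holes, and the pins are lined up with the holes and pushed in. (Recently, on the motherboard side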

mooz 2012/06/26

Latency numbers every programmer should know — Gist

Forks gist: 2843573 by chetan Latency numbers every progr... created May 31, 2012 gist: 2844153 by mikea Latency numbers every progr... created May 31, 2012 gist: 2844932 by adragomir Latency numbers every progr... created May 31, 2012 gist: 2850587 by Bamco Latency numbers every progr... created June 01, 2012 gist: 2851124 by Stals Latency numbers every progr... created June 01, 2012 gist: 285208

mooz 2012/06/02

Various latencies: cache miss, branch misprediction, mutex, memory reference, compression, network send, disk seek, and so on.

Parallel Programming (Part 2): DSAS Developers' Room

3. Memory Ordering. With multithreading on a single processor, simple synchronization could be written using a volatile variable as a flag, for example as follows. (Assume the compiler does not reorder accesses to volatile variables.) volatile int done = 0; volatile struct { int foo; int bar; } foobar; void writer(void) { foobar.foo = fizz(); foobar.bar = bazz(); done = 1; } void reader(void) { int foo, bar; while (!done) sleep(1); foo = foobar.foo; bar = foobar.bar; } This may not work correctly in a multiprocessor environment. These days

mooz 2012/05/25

Memory barriers: lfence, sfence, mfence.
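
For comparison with the volatile-flag pattern in the excerpt, here is a sketch of the same handoff written with C11 atomics. It swaps in release/acquire ordering from `<stdatomic.h>` rather than the explicit lfence/sfence/mfence instructions named in the note, and leaves it to the compiler to emit whatever barriers the target needs. The `try_read` function and the parameters of `writer` are assumptions made for illustration; they are not from the bookmarked article.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Shared payload, published by the writer and consumed by the reader. */
static struct { int foo; int bar; } foobar;

/* Publication flag; static storage starts out false. */
static atomic_bool done;

/* Writer thread: fill the payload, then publish it with a release store.
   The release store keeps the payload writes from being reordered after it. */
void writer(int foo, int bar)
{
    foobar.foo = foo;
    foobar.bar = bar;
    atomic_store_explicit(&done, true, memory_order_release);
}

/* Reader thread: returns false if nothing has been published yet.
   The acquire load keeps the payload reads from being reordered before it,
   so a reader that sees done == true also sees the writer's payload. */
bool try_read(int *foo, int *bar)
{
    assert(foo != NULL && bar != NULL);
    if (!atomic_load_explicit(&done, memory_order_acquire))
        return false;
    *foo = foobar.foo;
    *bar = foobar.bar;
    return true;
}
```

In real use `writer` and `try_read` would run on different threads, with the reader polling or sleeping until `try_read` returns true.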

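A talk about CPUs and caches - graphics.hatenablog.com

This is not limited to graphics, and it has nothing to do with papers in the first place. But it comes up now and then around GPUs, and I sometimes get confused myself, so I am sorting it out here. Main memory is slow: when the CPU reads data from main memory, it is simply slow. Compared with reading data that is in a register, for example, it is roughly several times to several hundred times slower. That is bad, so let's do something about it and keep the data somewhere that can be accessed faster. Cache lines: when the CPU reads data from main memory, it always loads a small memory chunk into the cache. The load unit depends on the processor, but is roughly 8 to 512 bytes. This load unit is called a cache line. If the data being accessed is already in the cache, the CPU reads the cache instead of main memory. If not, it accesses main memory, but that data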

mooz 2012/05/04

cpu
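
One way to see the effect of cache lines is to traverse the same matrix in two orders. The sketch below, with hypothetical function names, sums a row-major matrix stored in a flat array: the row-major loop touches consecutive addresses and so reuses each loaded cache line, while the column-major loop jumps `cols` elements per step and tends to touch a new line on every access once the matrix is large. Both return the same sum; only the access pattern differs, and any speed difference has to be measured on the target machine.

```c
#include <assert.h>
#include <stddef.h>

/* Sum in storage order: consecutive accesses fall in the same cache line. */
long sum_row_major(const int *m, size_t rows, size_t cols)
{
    assert(m != NULL || rows * cols == 0);
    long total = 0;
    for (size_t r = 0; r < rows; r++)
        for (size_t c = 0; c < cols; c++)
            total += m[r * cols + c];
    return total;
}

/* Sum column by column: each access is cols elements away from the last. */
long sum_col_major(const int *m, size_t rows, size_t cols)
{
    assert(m != NULL || rows * cols == 0);
    long total = 0;
    for (size_t c = 0; c < cols; c++)
        for (size_t r = 0; r < rows; r++)
            total += m[r * cols + c];
    return total;
}
```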

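[Interview] Massively parallel architecture and dependability: the future of processor development (1) Dataflow machines and out-of-order execution | Enterprise | Mynavi News

Professor Shuichi Sakai, Department of Information and Communication Engineering, Graduate School of Information Science and Technology, The University of Tokyo. There is a computer architecture known as the dataflow type. It differs from the von Neumann type that underlies today's computers. The dataflow machine was conceived at MIT in the United States in the 1970s, and research and development then proceeded worldwide through the 1980s. In Japan, development of the dataflow machine "EM-4" began in 1986 at the Electrotechnical Laboratory (now the National Institute of Advanced Industrial Science and Technology). Those in charge of the architecture research and prototype development at the time included Shuichi Sakai, then at the Electrotechnical Laboratory (now a professor at the department named above). We asked Sakai about the outlook from dataflow machines to the computers of the future. Dataflow machines and out-of-order execution: today's computers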

mooz 2012/04/20

Tiled multicore

cpu
hpc

CPU DB: Recording Microprocessor History - ACM Queue

April 6, 2012, Volume 10, issue 4. CPU DB: Recording Microprocessor History. With this open database, you can mine microprocessor trends over the past 40 years. Andrew Danowitz, Kyle Kelley, James Mao, John P. Stevenson, Mark Horowitz, Stanford University. In November 1971, Intel introduced the world’s first single-chip microprocessor, the Intel 4004. It had 2,300 transistors, ran at a clock speed

mooz 2012/04/07

Intel optimization

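Let's read the Intel processor optimization manual. I had read it to some extent before, but this is an attempt to reread the "Intel 64 and IA-32 Architectures Optimization Reference Manual" and summarize what I noticed. The copy I have at hand is 248966-024, April 2011. There is also a Japanese edition, but it is somewhat dated, so the English edition is preferable. On notation: I write both "processor" and "CPU". I sometimes write Intel64 as x64. Out of personal habit I often write, for example, 3 cycles as 3clk. Apologies. If you find any mistakes, please let me know by email (herumi@nifty.com) or via @herumi. Chapter 2: Intel64/IA-32 CPU architecture. 2.1 Sandy Bridge overview 2.1.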

mooz 2012/03/31

Database Systems

Welcome to the web presence of the Database Research Group at University of Tübingen. Our group pursues a variety of “all-time classic” database research questions—prime examples include query language design, translation, and optimization—but with a few twists: We are particularly interested in the design, compilation, and optimization of expressive database languages that support rich data model

mooz 2012/03/10

CPU- and memory-aware query processing and optimization.

Math Kernel Library - Wikipedia

Intel oneAPI Math Kernel Library (Intel oneMKL), formerly known as Intel Math Kernel Library, is a library of optimized math routines for science, engineering, and financial applications. Core math functions include BLAS, LAPACK, ScaLAPACK, sparse solvers, fast Fourier transforms, and vector math.[5][6] The library supports x86 CPUs and Intel GPUs[2] and is available for Windows and Linux operati

mooz 2012/03/02

A matrix computation library optimized for Intel processors. Julia uses it too.
