pursuing cool things

Mainly high-performance implementations of common functions.

matrix multiplication
Fast: cache-friendly (blocked, transform), AVX.

memory benchmark
Fast memset/memcpy implementations: AVX, stream, multi-threading.

page fault
Study the page fault cost of normal pages and huge pages.
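
To illustrate the cache-friendly (blocked) matrix multiplication listed above, here is a minimal sketch in C. It is not the code from this repo: the names `matmul_naive`, `matmul_blocked`, and the tile size `BLOCK` are hypothetical, and it assumes square, row-major matrices of `double`. The idea is to work on small tiles so the data touched by the inner loops stays in cache while it is reused.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tile size (an assumption, not a tuned value): pick it so
 * that three BLOCK x BLOCK tiles of doubles fit in the target cache. */
#define BLOCK 32

/* Reference triple loop: c = a * b for row-major n x n matrices. */
void matmul_naive(size_t n, const double *a, const double *b, double *c)
{
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            double sum = 0.0;
            for (size_t k = 0; k < n; k++)
                sum += a[i * n + k] * b[k * n + j];
            c[i * n + j] = sum;
        }
    }
}

static size_t min_size(size_t x, size_t y)
{
    return x < y ? x : y;
}

/* Blocked version: computes the same product tile by tile.
 * The i-k-j order inside a tile walks b and c along rows (unit stride). */
void matmul_blocked(size_t n, const double *a, const double *b, double *c)
{
    assert(a != NULL && b != NULL && c != NULL);

    /* c accumulates partial sums, so it must start at zero. */
    for (size_t i = 0; i < n * n; i++)
        c[i] = 0.0;

    for (size_t ii = 0; ii < n; ii += BLOCK) {
        size_t i_end = min_size(ii + BLOCK, n);
        for (size_t kk = 0; kk < n; kk += BLOCK) {
            size_t k_end = min_size(kk + BLOCK, n);
            for (size_t jj = 0; jj < n; jj += BLOCK) {
                size_t j_end = min_size(jj + BLOCK, n);
                /* Multiply one tile of a by one tile of b. */
                for (size_t i = ii; i < i_end; i++) {
                    for (size_t k = kk; k < k_end; k++) {
                        double aik = a[i * n + k];
                        for (size_t j = jj; j < j_end; j++)
                            c[i * n + j] += aik * b[k * n + j];
                    }
                }
            }
        }
    }
}
```

Both functions take the matrix dimension `n` and three `n * n` arrays. A reasonable way to use the sketch is to time `matmul_naive` against `matmul_blocked` for a few values of `BLOCK` and matrix sizes larger than the last-level cache. The unit-stride inner loop is also the natural place to add AVX intrinsics later.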