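BPF之巅:洞悉Linux系统和应用性能 (BPF Performance Tools: Linux System and Application Observability)

Intended audience: BPF performance tools are an indispensable resource for administrators, developers, support staff, and other IT professionals, who can use them on any recent Linux distribution in any enterprise or cloud environment.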
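BPF-based performance tools provide unprecedented system-level and application-level observability. With them you can optimize performance, debug code, strengthen security, and reduce costs. This book is a comprehensive guide to using these observability tools. It covers BPF from its origins to its future directions: it presents the BPF programming model in full, gives complete coverage of the two main BPF front-end programming frameworks, BCC and bpftrace, and provides a series of implementation examples. These examples demonstrate what BPF can do in practice and where it is heading, and show how to use BPF tools to optimize performance, fix problems, and explore the internals of production systems.

The book's other focus is tuning Linux system performance and application performance. It covers strategies, tools, and practical case studies for system performance tuning. It introduces the relevant BPF tools and also emphasizes how they complement traditional Linux performance tools, so that readers can choose among them.

The tools presented in the book are small and polished and come with simple, readable source code. This is the appeal of BPF: safe, efficient, and fast system extensibility. BPF will be applied in more and more scenarios in Linux and will become increasingly important. We hope this book helps readers who are learning the technology and following its development. It is a reference for system administrators, application developers, operations staff, and other IT practitioners who use any Linux distribution within an enterprise or in the cloud.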
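Brendan Gregg, a senior performance engineer at Netflix, is a major contributor to BPF (eBPF). He helped develop and maintain the two main BPF front-end frameworks, pioneered the use of BPF for observability, and created dozens of BPF-based performance analysis tools. His best-selling books include 《性能之巅:洞悉系统、企业与云计算》 (Systems Performance: Enterprise and the Cloud).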
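About the translators

Sun Yucong (孙宇聪) is currently an operations engineering manager at Facebook. He previously worked at Google for many years as a Senior Site Reliability Engineer (SRE) and was formerly technical lead at Coding.net. His translations include best-selling technical classics such as 《SRE:Google运维解密》 and 《架构整洁之道》.

Lü Hongli (吕宏利) is a senior SRE who currently works in Google's infrastructure department. He was previously responsible for operating Google's search ads and content ads systems and has many years of experience in developing and operating distributed systems. He has worked in depth on operations tooling platforms, monitoring, application performance tracing and analysis, and data-driven operations.

Liu Xiaozhou (刘晓舟) graduated from the Department of Computer Science at Peking University and is now a systems architect in the systems department at ByteDance. At ByteDance he led the construction of a large-scale, eBPF-based platform for performance analysis and for network monitoring and diagnostics, and in his spare time he contributes code to related open-source communities. Before joining ByteDance he spent 10 years in e-government and big-data research for national ministries.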
第1章 引 言.................................................................................................................1
1.1 BPF和eBPF是什么 .............................................................................................. 1 1.2 跟蹤、嗅探、采樣、剖析和可觀測(cè)性分別是什么 .......................................... 2 1.3 BCC、bpftrace和IO Visor ................................................................................... 3 1.4 初識(shí)BCC:快速上手 .......................................................................................... 4 1.5 BPF跟蹤的能見度 ............................................................................................... 7 1.6 動(dòng)態(tài)插樁:kprobes和uprobes ............................................................................ 8 1.7 靜態(tài)插樁:tracepoint和USDT ........................................................................... 9 1.8 初識(shí)bpftrace:跟蹤open() ................................................................................ 10 1.9 再回到BCC:跟蹤open() ................................................................................. 13 1.10 小結(jié) .................................................................................................................. 15 第2章 技術(shù)背景...........................................................................................................16 2.1 圖釋BPF ............................................................................................................. 16 2.2 BPF ..................................................................................................................... 17 2.3 擴(kuò)展版BPF ......................................................................................................... 18 2.3.1 為什么性能工具需要 BPF 技術(shù) ......................................................... 
21 2.3.2 BPF 與內(nèi)核模塊的對(duì)比 ......................................................................23 2.3.3 編寫 BPF 程序 .....................................................................................23 2.3.4 使用 BPF 查看指令集 :bpftool ..........................................................24 2.3.5 使用 bpftrace 查看 BPF 指令集 .......................................................... 32 2.3.6 BPF API ................................................................................................ 33 2.3.7 BPF 并發(fā)控制 ......................................................................................37 2.3.8 BPF sysfs 接口 ..................................................................................... 38 2.3.9 BPF 類型格式 ......................................................................................38 2.3.10 BPF CO-RE ........................................................................................ 39 2.3.11 BPF 的局限性 ..................................................................................... 40 2.3.12 BPF 擴(kuò)展閱讀資料 ............................................................................40 2.4 調(diào)用;厮 ........................................................................................................ 41 2.4.1 基于幀指針的調(diào)用;厮 ................................................................... 41 2.4.2 調(diào)試信息 ............................................................................................... 42 2.4.3 最后分支記錄 ....................................................................................... 43 2.4.4 ORC ...................................................................................................... 43 2.4.5 符號(hào) ....................................................................................................... 43 2.4.6 擴(kuò)展閱讀 ............................................................................................... 
43 2.5 火焰圖 ................................................................................................................ 44 2.5.1 調(diào)用棧信息 ........................................................................................... 44 2.5.2 對(duì)調(diào)用棧信息的剖析 ........................................................................... 44 2.5.3 火焰圖 ................................................................................................... 45 2.5.4 火焰圖的特性 ....................................................................................... 47 2.5.5 火焰圖的變體 ....................................................................................... 48 2.6 事件源 ................................................................................................................ 48 2.7 kprobes ............................................................................................................... 49 2.7.1 kprobes 是如何工作的 ......................................................................... 49 2.7.2 kprobes 接口 ......................................................................................... 51 2.7.3 BPF 和 kprobes .....................................................................................51 2.7.4 關(guān)于 kprobes 的更多內(nèi)容 .................................................................... 53 2.8 uprobes ............................................................................................................... 53 2.8.1 uprobes 是如何工作的 ......................................................................... 53 2.8.2 uprobes 接口 ......................................................................................... 55 2.8.3 BPF 與 uprobes .....................................................................................55 2.8.4 uprobes 的開銷和未來的工作 ............................................................. 56 2.8.5 擴(kuò)展閱讀 ............................................................................................... 
57 2.9 跟蹤點(diǎn) ................................................................................................................ 57 2.9.1 如何添加跟蹤點(diǎn) ................................................................................... 58 2.9.2 跟蹤點(diǎn)的工作原理 ............................................................................... 59 2.9.3 跟蹤點(diǎn)的接口 ....................................................................................... 60 2.9.4 跟蹤點(diǎn)和 BPF ...................................................................................... 61 2.9.5 BPF 原始跟蹤點(diǎn) ..................................................................................62 2.9.6 擴(kuò)展閱讀 ............................................................................................... 62 2.10 USDT ............................................................................................................... 62 2.10.1 添加 USDT 探針 ................................................................................ 63 2.10.2 USDT 是如何工作的 ......................................................................... 65 2.10.3 BPF 與 USDT ..................................................................................... 66 2.10.4 USDT 的更多信息 ............................................................................. 66 2.11 動(dòng)態(tài)USDT ........................................................................................................ 66 2.12 性能監(jiān)控計(jì)數(shù)器 .............................................................................................. 68 2.12.1 PMC 的模式 .......................................................................................68 2.12.2 PEBS ................................................................................................... 69 2.12.3 云計(jì)算 ................................................................................................. 
69 2.13 perf_events ....................................................................................................... 69 2.14 小結(jié) .................................................................................................................. 70 第3章 性能分析...........................................................................................................71 3.1 概覽 .................................................................................................................... 71 3.1.1 目標(biāo) ....................................................................................................... 71 3.1.2 分析工作 ............................................................................................... 72 3.1.3 多重性能問題 ....................................................................................... 73 3.2 性能分析方法論 ................................................................................................ 73 3.2.1 業(yè)務(wù)負(fù)載畫像 ....................................................................................... 74 3.2.2 下鉆分析 ............................................................................................... 75 3.2.3 USE 方法論 .......................................................................................... 76 3.2.4 檢查清單法 ........................................................................................... 77 3.3 Linux 60秒分析 ................................................................................................. 77 3.3.1 uptime ................................................................................................... 77 3.3.2 dmesg | tail ............................................................................................ 78 3.3.3 vmstat 1 ................................................................................................. 
78 3.3.4 mpstat -P ALL 1 ....................................................................................79 3.3.5 pidstat 1 ................................................................................................. 80 3.3.6 iostat -xz 1 ............................................................................................. 80 3.3.7 free -m ................................................................................................... 82 3.3.8 sar -n DEV 1 .........................................................................................82 3.3.9 sar -n TCP,ETCP 1 ................................................................................ 83 3.3.10 top ....................................................................................................... 83 3.4 BCC工具檢查清單 ............................................................................................ 84 3.4.1 execsnoop .............................................................................................. 84 3.4.2 opensnoop ............................................................................................. 85 3.4.3 ext4slower ............................................................................................. 85 3.4.4 biolatency .............................................................................................. 86 3.4.5 biosnoop ................................................................................................ 86 3.4.6 cachestat ................................................................................................ 87 3.4.7 tcpconnect .............................................................................................87 3.4.8 tcpaccept ............................................................................................... 87 3.4.9 tcpretrans ............................................................................................... 
88 3.4.10 runqlat ................................................................................................. 88 3.4.11 profile .................................................................................................. 89 3.5 小結(jié) .................................................................................................................... 90 第4章 BCC..................................................................................................................91 4.1 BCC的組件 ........................................................................................................ 92 4.2 BCC的特性 ........................................................................................................ 92 4.2.1 BCC 的內(nèi)核態(tài)特性 .............................................................................. 92 4.2.2 BCC 的用戶態(tài)特性 .............................................................................. 93 4.3 安裝BCC ............................................................................................................ 94 4.3.1 內(nèi)核要求 ............................................................................................... 94 4.3.2 Ubuntu ................................................................................................... 94 4.3.3 RHEL .................................................................................................... 95 4.3.4 其他發(fā)行版 ........................................................................................... 95 4.4 BCC的工具 ........................................................................................................ 96 4.4.1 重點(diǎn)工具 ............................................................................................... 96 4.4.2 工具的特點(diǎn) ........................................................................................... 97 4.4.3 單一用途工具 ....................................................................................... 
98 4.4.4 多用途工具 ........................................................................................... 99 4.5 funccount .......................................................................................................... 100 4.5.1 funccount 的示例 ................................................................................ 101 4.5.2 funccount 的語法 ................................................................................ 103 4.5.3 funccount 的單行程序 ........................................................................ 103 4.5.4 funccount 的幫助信息 ........................................................................ 104 4.6 stackcount ......................................................................................................... 105 4.6.1 stackcount 的示例 .............................................................................. 105 4.6.2 stackcount 的火焰圖 .......................................................................... 107 4.6.3 stackcount 殘缺的調(diào)用棧 .................................................................. 108 4.6.4 stackcount 的語法 .............................................................................. 108 4.6.5 stackcount 的單行程序 ...................................................................... 109 4.6.6 stackcount 的幫助信息 ...................................................................... 109 4.7 trace .................................................................................................................. 110 4.7.1 trace 的示例 .........................................................................................111 4.7.2 trace 的語法 .........................................................................................111 4.7.3 trace 的單行程序 ................................................................................ 113 4.7.4 trace 的結(jié)構(gòu)體 .................................................................................... 
113 4.7.5 trace 調(diào)試文件描述符泄露問題 ........................................................ 114 4.7.6 trace 的幫助信息 ................................................................................ 115 4.8 argdist ............................................................................................................... 117 4.8.1 argdist 的語法 ..................................................................................... 118 4.8.2 argdist 的單行程序 ............................................................................. 119 4.8.3 argdist 的幫助信息 ............................................................................. 119 4.9 工具文檔 .......................................................................................................... 121 4.9.1 man 幫助文檔 :opensnoop ............................................................... 121 4.9.2 示例文件 :opensnoop ....................................................................... 125 4.10 開發(fā)BCC工具 ................................................................................................ 126 4.11 BCC的內(nèi)部實(shí)現(xiàn) ............................................................................................ 127 4.12 BCC的調(diào)試 .................................................................................................... 128 4.12.1 printf() 調(diào)試 ...................................................................................... 129 4.12.2 BCC 調(diào)試輸出 ..................................................................................131 4.12.3 BCC 的調(diào)試標(biāo)志位 .......................................................................... 132 4.12.4 bpflist ................................................................................................. 133 4.12.5 bpftool ............................................................................................... 134 4.12.6 dmesg ................................................................................................ 
134 4.12.7 重置事件 ........................................................................................... 134 4.13 小結(jié) ................................................................................................................ 136 第5章 bpftrace...........................................................................................................137 5.1 bpftrace的組件 ................................................................................................. 138 5.2 bpftrace的特性 ................................................................................................. 139 5.2.1 bpftrace 的事件源 .............................................................................. 139 5.2.2 bpftrace 的動(dòng)作 .................................................................................. 139 5.2.3 bpftrace 的一般特性 .......................................................................... 140 5.2.4 bpftrace 與其他觀測(cè)工具的比較 ...................................................... 140 5.3 bpftrace的安裝 ................................................................................................. 141 5.3.1 內(nèi)核版本要求 ..................................................................................... 141 5.3.2 Ubuntu ................................................................................................. 142 5.3.3 Fedora ................................................................................................. 142 5.3.4 構(gòu)建后的安裝步驟 ............................................................................. 143 5.3.5 其他發(fā)行版 ......................................................................................... 143 5.4 bpftrace工具 ..................................................................................................... 143 5.4.1 重點(diǎn)工具 ............................................................................................. 
144 5.4.2 工具特征 ............................................................................................. 144 5.4.3 工具的運(yùn)行 ......................................................................................... 145 5.5 bpftrace單行程序 ............................................................................................. 145 5.6 bpftrace的文檔 ................................................................................................. 146 5.7 bpftrace編程 ..................................................................................................... 146 5.7.1 用法 ..................................................................................................... 147 5.7.2 程序結(jié)構(gòu) ............................................................................................. 148 5.7.3 注釋 ..................................................................................................... 148 5.7.4 探針格式 ............................................................................................. 149 5.7.5 探針通配符 ......................................................................................... 149 5.7.6 過濾器 ................................................................................................. 150 5.7.7 動(dòng)作 ..................................................................................................... 150 5.7.8 Hello, World! .......................................................................................151 5.7.9 函數(shù) ..................................................................................................... 151 5.7.10 變量 ................................................................................................... 152 5.7.11 映射表函數(shù) ....................................................................................... 153 5.7.12 對(duì) vfs_read() 計(jì)時(shí) ............................................................................ 
154 5.8 bpftrace的幫助信息 ......................................................................................... 155 5.9 bpftrace的探針類型 ......................................................................................... 157 5.9.1 tracepoint ............................................................................................. 157 5.9.2 usdt ...................................................................................................... 159 5.9.3 kprobe 和 kretprobe ............................................................................160 5.9.4 uprobe 和 uretprobe ............................................................................ 160 5.9.5 software 和 hardware ..........................................................................161 5.9.6 profile 和 interval ................................................................................ 162 5.10 bpftrace的控制流 ........................................................................................... 163 5.10.1 過濾器 ............................................................................................... 163 5.10.2 三元操作符 ....................................................................................... 163 5.10.3 if 語句 ............................................................................................... 163 5.10.4 循環(huán)展開 ........................................................................................... 164 5.11 bpftrace的運(yùn)算符 ........................................................................................... 164 5.12 bpftrace的變量 ............................................................................................... 165 5.12.1 內(nèi)置變量 ........................................................................................... 
165 5.12.2 內(nèi)置變量 :pid、comm 和 uid ........................................................166 5.12.3 內(nèi)置變量 :kstack 和 ustack ............................................................166 5.12.4 內(nèi)置變量 :位置參數(shù) ....................................................................... 168 5.12.5 臨時(shí)變量 ........................................................................................... 169 5.12.6 映射表變量 ....................................................................................... 169 5.13 bpftrace的函數(shù) ............................................................................................... 170 5.13.1 printf() ............................................................................................... 171 5.13.2 join() .................................................................................................. 172 5.13.3 str() .................................................................................................... 173 5.13.4 kstack() 和 ustack() ........................................................................... 173 5.13.5 ksym() 和 usym() .............................................................................. 174 5.13.6 kaddr() 和 uaddr() .............................................................................175 5.13.7 system() ............................................................................................. 176 5.13.8 exit() .................................................................................................. 176 5.14 bpftrace映射表的操作函數(shù) ........................................................................... 177 5.14.1 count() ............................................................................................... 177 5.14.2 sum()、avg()、min() 和 max() ........................................................178 5.14.3 hist() .................................................................................................. 
179 5.14.4 lhist() ................................................................................................. 180 5.14.5 delete() .............................................................................................. 181 5.14.6 clear() 和 zero() .................................................................................181 5.14.7 print() ................................................................................................ 182 5.15 bpftrace的下一步工作 ................................................................................... 183 5.15.1 顯式區(qū)分地址模式 ........................................................................... 183 5.15.2 其他擴(kuò)展 ........................................................................................... 184 5.15.3 ply ..................................................................................................... 184 5.16 bpftrace的內(nèi)部運(yùn)作 ....................................................................................... 185 5.17 bpftrace的調(diào)試 ............................................................................................... 186 5.17.1 printf() 調(diào)試 ...................................................................................... 186 5.17.2 調(diào)試模式 ........................................................................................... 187 5.17.3 詳情模式 ........................................................................................... 188 5.18 小結(jié) ................................................................................................................ 190 第6章 CPU................................................................................................................191 6.1 背景知識(shí) .......................................................................................................... 
192 6.1.1 CPU 基礎(chǔ)知識(shí) ....................................................................................192 6.1.2 BPF 的分析能力 ................................................................................194 6.1.3 分析策略 ............................................................................................. 196 6.2 傳統(tǒng)工具 .......................................................................................................... 197 6.2.1 內(nèi)核統(tǒng)計(jì) ............................................................................................. 197 6.2.2 硬件統(tǒng)計(jì) ............................................................................................. 200 6.2.3 硬件采樣 ............................................................................................. 202 6.2.4 定時(shí)采樣 ............................................................................................. 203 6.2.5 事件統(tǒng)計(jì)與事件跟蹤 ......................................................................... 207 6.3 BPF工具 ........................................................................................................... 210 6.3.1 execsnoop ............................................................................................ 211 6.3.2 exitsnoop ............................................................................................. 214 6.3.3 runqlat ................................................................................................. 215 6.3.4 runqlen ................................................................................................ 219 6.3.5 runqslower ........................................................................................... 222 6.3.6 cpudist ................................................................................................. 223 6.3.7 cpufreq ................................................................................................ 
224 6.3.8 profile .................................................................................................. 227 6.3.9 offcputime ........................................................................................... 232 6.3.10 syscount ............................................................................................. 236 6.3.11 argdist 和 trace ..................................................................................239 6.3.12 funccount ........................................................................................... 242 6.3.13 softirqs ............................................................................................... 244 6.3.14 hardirqs ............................................................................................. 245 6.3.15 smpcalls ............................................................................................. 246 6.3.16 llcstat ................................................................................................. 250 6.3.17 其他工具 ........................................................................................... 251 6.4 BPF單行程序 ................................................................................................... 251 6.4.1 BCC 工具 ............................................................................................251 6.4.2 bpftrace 版本 ...................................................................................... 252 6.5 可選練習(xí) .......................................................................................................... 253 6.6 小結(jié) .................................................................................................................. 254 第7章 內(nèi)存................................................................................................................255 7.1 背景知識(shí) .......................................................................................................... 
7.1.1 Memory Fundamentals
7.1.2 BPF Capabilities
7.1.3 Strategy
7.2 Traditional Tools
7.2.1 Kernel Log
7.2.2 Kernel Statistics
7.2.3 Hardware Statistics and Sampling
7.3 BPF Tools
7.3.1 oomkill
7.3.2 memleak
7.3.3 mmapsnoop
7.3.4 brkstack
7.3.5 shmsnoop
7.3.6 faults
7.3.7 ffaults
7.3.8 vmscan
7.3.9 drsnoop
7.3.10 swapin
7.3.11 hfaults
7.3.12 Other Tools
7.4 BPF One-Liners
7.4.1 BCC
7.4.2 bpftrace
7.5 Optional Exercises
7.6 Summary
Chapter 8 File Systems
8.1 Background
8.1.1 File System Fundamentals
8.1.2 BPF Capabilities
8.1.3 Strategy
8.2 Traditional Tools
8.2.1 df
8.2.2 mount
8.2.3 strace
8.2.4 perf
8.2.5 fatrace
8.3 BPF Tools
8.3.1 opensnoop
8.3.2 statsnoop
8.3.3 syncsnoop
8.3.4 mmapfiles
8.3.5 scread
8.3.6 fmapfault
8.3.7 filelife
8.3.8 vfsstat
8.3.9 vfscount
8.3.10 vfssize
8.3.11 fsrwstat
8.3.12 fileslower
8.3.13 filetop
8.3.14 writesync
8.3.15 filetype
8.3.16 cachestat
8.3.17 writeback
8.3.18 dcstat
8.3.19 dcsnoop
8.3.20 mountsnoop
8.3.21 xfsslower
8.3.22 xfsdist
8.3.23 ext4dist
8.3.24 icstat
8.3.25 bufgrow
8.3.26 readahead
8.3.27 Other Tools
8.4 BPF One-Liners
8.4.1 BCC
8.4.2 bpftrace
8.4.3 BPF One-Liner Examples
8.5 Optional Exercises
8.6 Summary
Chapter 9 Disk I/O
9.1 Background
9.1.1 Disk System Fundamentals
9.1.2 BPF Capabilities
9.1.3 Strategy
9.2 Traditional Tools
9.2.1 iostat
9.2.2 perf
9.2.3 blktrace
9.2.4 SCSI Logging
9.3 BPF Tools
9.3.1 biolatency
9.3.2 biosnoop
9.3.3 biotop
9.3.4 bitesize
9.3.5 seeksize
9.3.6 biopattern
9.3.7 biostacks
9.3.8 bioerr
9.3.9 mdflush
9.3.10 iosched
9.3.11 scsilatency
9.3.12 scsiresult
9.3.13 nvmelatency
9.4 BPF One-Liners
9.4.1 BCC
9.4.2 bpftrace
9.4.3 BPF One-Liner Examples
9.5 Optional Exercises
9.6 Summary
Chapter 10 Networking
10.1 Background
10.1.1 Networking Fundamentals
10.1.2 BPF Capabilities
10.1.3 Strategy
10.1.4 Common Tracing Mistakes
10.2 Traditional Tools
10.2.1 ss
10.2.2 ip
10.2.3 nstat
10.2.4 netstat
10.2.5 sar
10.2.6 nicstat
10.2.7 ethtool
10.2.8 tcpdump
10.2.9 /proc
10.3 BPF Tools
10.3.1 sockstat
10.3.2 sofamily
10.3.3 soprotocol
10.3.4 soconnect
10.3.5 soaccept
10.3.6 socketio
10.3.7 socksize
10.3.8 sormem
10.3.9 soconnlat
10.3.10 so1stbyte
10.3.11 tcpconnect
10.3.12 tcpaccept
10.3.13 tcplife
10.3.14 tcptop
10.3.15 tcpsnoop
10.3.16 tcpretrans
10.3.17 tcpsynbl
10.3.18 tcpwin
10.3.19 tcpnagle
10.3.20 udpconnect
10.3.21 gethostlatency
10.3.22 ipecn
10.3.23 superping
10.3.24 qdisc-fq
10.3.25 qdisc-cbq, qdisc-cbs, qdisc-codel, qdisc-fq_codel, qdisc-red, and qdisc-tbf
10.3.26 netsize
10.3.27 nettxlat
10.3.28 skbdrop
10.3.29 skblife
10.3.30 ieee80211scan
10.3.31 Other Tools
10.4 BPF One-Liners
10.4.1 BCC
10.4.2 bpftrace
10.4.3 BPF One-Liner Examples
10.5 Optional Exercises
10.6 Summary
Chapter 11 Security
11.1 Background
11.1.1 BPF Capabilities
11.1.2 Unprivileged BPF Users
11.1.3 Configuring BPF Security
11.1.4 Strategy
11.2 BPF Tools
11.2.1 execsnoop
11.2.2 elfsnoop
11.2.3 modsnoop
11.2.4 bashreadline
11.2.5 shellsnoop
11.2.6 ttysnoop
11.2.7 opensnoop
11.2.8 eperm
11.2.9 tcpconnect and tcpaccept
11.2.10 tcpreset
11.2.11 capable
11.2.12 setuids
11.3 BPF One-Liners
11.3.1 BCC
11.3.2 bpftrace
11.3.3 BPF One-Liner Examples
11.4 Summary
Chapter 12 Programming Languages
12.1 Background
12.1.1 Compiled Languages
12.1.2 JIT-Compiled Languages
12.1.3 Interpreted Languages
12.1.4 BPF Capabilities
12.1.5 Strategy
12.1.6 BPF Tools
12.2 C
12.2.1 C Function Symbols
12.2.2 C Stack Traces
12.2.3 C Function Tracing
12.2.4 C Function Offset Tracing
12.2.5 C USDT
12.2.6 C One-Liners
12.3 Java
12.3.1 libjvm Tracing
12.3.2 jnistacks
12.3.3 Java Thread Names
12.3.4 Java Method Symbols
12.3.5 Java Stack Traces
12.3.6 Java USDT Probes
12.3.7 profile
12.3.8 offcputime
12.3.9 stackcount
12.3.10 javastat
12.3.11 javathreads
12.3.12 javacalls
12.3.13 javaflow
12.3.14 javagc
12.3.15 javaobjnew
12.3.16 Java One-Liners
12.4 bash shell
12.4.1 Function Counts
12.4.2 Function Argument Tracing (bashfunc.bt)
12.4.3 Function Latency (bashfunclat.bt)
12.4.4 /bin/bash
12.4.5 /bin/bash USDT
12.4.6 bash One-Liners
12.5 Other Languages
12.5.1 JavaScript (Node.js)
12.5.2 C++
12.5.3 Golang
12.6 Summary
Chapter 13 Applications
13.1 Background
13.1.1 Application Fundamentals
13.1.2 Application Example: MySQL Server
13.1.3 BPF Capabilities
13.1.4 Strategy
13.2 BPF Tools
13.2.1 execsnoop
13.2.2 threadsnoop
13.2.3 profile
13.2.4 threaded
13.2.5 offcputime
13.2.6 offcpuhist
13.2.7 syscount
13.2.8 ioprofile
13.2.9 libc Frame Pointers
13.2.10 mysqld_qslower
13.2.11 mysqld_clat
13.2.12 signals
13.2.13 killsnoop
13.2.14 pmlock and pmheld
13.2.15 naptime
13.2.16 Other Tools
13.3 BPF One-Liners
13.3.1 BCC
13.3.2 bpftrace
13.4 BPF One-Liner Examples
13.5 Summary
Chapter 14 Kernel
14.1 Background
14.1.1 Kernel Fundamentals
14.1.2 BPF Capabilities
14.2 Strategy
14.3 Traditional Tools
14.3.1 Ftrace
14.3.2 perf sched
14.3.3 slabtop
14.3.4 Other Tools
14.4 BPF Tools
14.4.1 loads
14.4.2 offcputime
14.4.3 wakeuptime
14.4.4 offwaketime
14.4.5 mlock and mheld
14.4.6 Spin Locks
14.4.7 kmem
14.4.8 kpages
14.4.9 memleak
14.4.10 slabratetop
14.4.11 numamove
14.4.12 workq
14.4.13 Tasklets
14.4.14 Other Tools
14.5 BPF One-Liners
14.5.1 BCC
14.5.2 bpftrace
14.6 BPF One-Liner Examples
14.6.1 Counting System Calls by Syscall Function
14.6.2 Counting hrtimer Starts by Kernel Function
14.7 Challenges
14.8 Summary
Chapter 15 Containers
15.1 Background
15.1.1 BPF Capabilities
15.1.2 Challenges
15.1.3 Strategy
15.2 Traditional Tools
15.2.1 From the Host
15.2.2 From the Container
15.2.3 systemd-cgtop
15.2.4 kubectl top
15.2.5 docker stats
15.2.6 /sys/fs/cgroups
15.2.7 perf
15.3 BPF Tools
15.3.1 runqlat
15.3.2 pidnss
15.3.3 blkthrot
15.3.4 overlayfs
15.4 BPF One-Liners
15.5 Optional Exercises
15.6 Summary
Chapter 16 Hypervisors
16.1 Background
16.1.1 BPF Capabilities
16.1.2 Suggested Strategies
16.2 Traditional Tools
16.3 Guest BPF Tools
16.3.1 Xen Hypercalls
16.3.2 xenhyper
16.3.3 Xen Callbacks
16.3.4 cpustolen
16.3.5 HVM Exit Tracing
16.4 Host BPF Tools
16.4.1 kvmexits
16.4.2 Future Work
16.5 Summary
Chapter 17 Other BPF Performance Tools
17.1 Vector and Performance Co-Pilot (PCP)
17.1.1 Visualizations
17.1.2 Visualization: Heat Maps
17.1.3 Visualization: Tabular Data
17.1.4 BCC Provided Metrics
17.1.5 Internals
17.1.6 Installing PCP and Vector
17.1.7 Connecting and Viewing Data
17.1.8 Configuring the BCC PMDA
17.1.9 Future Work
17.1.10 Further Reading
17.2 Grafana and Performance Co-Pilot
17.2.1 Installation and Configuration
17.2.2 Connecting and Viewing Data
17.2.3 Future Work
17.2.4 Further Reading
17.3 Cloudflare eBPF Prometheus Exporter (with Grafana)
17.3.1 Building and Running the ebpf Exporter
17.3.2 Configuring Prometheus to Monitor the ebpf_exporter Instance
17.3.3 Setting Up a Query in Grafana
17.3.4 Further Reading
17.4 kubectl-trace
17.4.1 Tracing Nodes
17.4.2 Tracing Pods and Containers
17.4.3 Further Reading
17.5 Other Tools
17.6 Summary
Chapter 18 Tips, Tricks, and Common Problems
18.1 Typical Event Frequency and Overhead
18.1.1 Frequency
18.1.2 Actions Performed
18.1.3 Test Yourself
18.2 Sample at 49 or 99 Hertz
18.3 Yellow Pigs and Gray Rats
18.4 Write Target Software
18.5 Learn Syscalls
18.6 Keep It Simple
18.7 Missing Events
18.8 Missing Stack Traces
18.8.1 How to Fix Broken Stack Traces
18.9 Missing Symbols (Function Names) When Printing
18.9.1 How to Fix Missing Symbols: JIT Runtimes (Java, Node.js, ...)
18.9.2 How to Fix Missing Symbols: ELF Binaries (C, C++, ...)
18.10 Missing Functions When Tracing
18.11 Feedback Loops
18.12 Dropped Events
Appendix A bpftrace One-Liners
Appendix B bpftrace Cheat Sheet
Appendix C BCC Tool Development
Appendix D C BPF
Appendix E BPF Instructions