Category: linux

  • Inspecting the threads of a process on Linux

    Linux has no true threads: a thread on Linux is a lightweight process (LWP).

    The main difference between a light weight process (LWP) and a normal process is that LWPs share the same address space and other resources like open files etc. As some resources are shared so these processes are considered to be light weight as compared to other normal processes and hence the name light weight processes.

    Listing the threads of a process

    ➜  ~ ps -Lf 1210
    UID        PID  PPID   LWP  C NLWP STIME TTY      STAT   TIME CMD
    mysql     1210     1  1210  0   10 4月17 ?       Ssl    0:02 /usr/sbin/mariadbd
    mysql     1210     1  1478  0   10 4月17 ?       Ssl    3:24 /usr/sbin/mariadbd
    mysql     1210     1  1479  0   10 4月17 ?       Ssl    0:01 /usr/sbin/mariadbd
    mysql     1210     1  1480  0   10 4月17 ?       Ssl    0:00 /usr/sbin/mariadbd
    mysql     1210     1  1481  0   10 4月17 ?       Ssl    0:00 /usr/sbin/mariadbd
    mysql     1210     1  1492  0   10 4月17 ?       Ssl    0:00 /usr/sbin/mariadbd
    mysql     1210     1  1566  0   10 4月17 ?       Ssl    0:00 /usr/sbin/mariadbd
    mysql     1210     1  9933  0   10 5月16 ?       Ssl    0:00 /usr/sbin/mariadbd
    mysql     1210     1  9937  0   10 5月16 ?       Ssl    0:00 /usr/sbin/mariadbd
    mysql     1210     1 13703  0   10 00:11 ?        Ssl    0:00 /usr/sbin/mariadbd
    
    ➜  ~ ls -l /proc/1210/task/
    总用量 0
    dr-xr-xr-x 7 mysql mysql 0 5月  16 13:44 1210
    dr-xr-xr-x 7 mysql mysql 0 5月  17 00:12 13703
    dr-xr-xr-x 7 mysql mysql 0 5月  16 13:44 1478
    dr-xr-xr-x 7 mysql mysql 0 5月  16 13:44 1479
    dr-xr-xr-x 7 mysql mysql 0 5月  16 13:44 1480
    dr-xr-xr-x 7 mysql mysql 0 5月  16 13:44 1481
    dr-xr-xr-x 7 mysql mysql 0 5月  16 13:44 1492
    dr-xr-xr-x 7 mysql mysql 0 5月  16 13:44 1566
    dr-xr-xr-x 7 mysql mysql 0 5月  17 00:12 9933
    dr-xr-xr-x 7 mysql mysql 0 5月  17 00:12 9937
    
    ➜  ~ top -Hp 1210
    top - 00:18:19 up 29 days,  7:20,  1 user,  load average: 0.21, 0.15, 0.13
    Threads:  10 total,   0 running,  10 sleeping,   0 stopped,   0 zombie
    %Cpu(s):  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
    KiB Mem :  1881996 total,    85684 free,   900112 used,   896200 buff/cache
    KiB Swap:        0 total,        0 free,        0 used.   791844 avail Mem
    
      PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND
     1210 mysql     20   0 1158504 130800   7732 S  0.0  7.0   0:02.17 mariadbd
     1478 mysql     20   0 1158504 130800   7732 S  0.0  7.0   3:24.05 mariadbd
     1479 mysql     20   0 1158504 130800   7732 S  0.0  7.0   0:01.59 mariadbd
     1480 mysql     20   0 1158504 130800   7732 S  0.0  7.0   0:00.00 mariadbd
     1481 mysql     20   0 1158504 130800   7732 S  0.0  7.0   0:00.05 mariadbd
     1492 mysql     20   0 1158504 130800   7732 S  0.0  7.0   0:00.00 mariadbd
     1566 mysql     20   0 1158504 130800   7732 S  0.0  7.0   0:00.00 mariadbd
     9933 mysql     20   0 1158504 130800   7732 S  0.0  7.0   0:00.22 mariadbd
     9937 mysql     20   0 1158504 130800   7732 S  0.0  7.0   0:00.24 mariadbd
    13703 mysql     20   0 1158504 130800   7732 S  0.0  7.0   0:00.00 mariadbd
    
    ➜  ~ pstree -p 1210
    mariadbd(1210)─┬─{mariadbd}(1478)
                   ├─{mariadbd}(1479)
                   ├─{mariadbd}(1480)
                   ├─{mariadbd}(1481)
                   ├─{mariadbd}(1492)
                   ├─{mariadbd}(1566)
                   ├─{mariadbd}(13703)
                   ├─{mariadbd}(15228)
                   ├─{mariadbd}(15229)
                   ├─{mariadbd}(15230)
                   └─{mariadbd}(15231)
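
    The same thread list can be read programmatically from /proc/<pid>/task, where each subdirectory is named after one LWP. The following is a minimal sketch; the function name `list_thread_ids` and its `proc_root` parameter are illustrative and not part of any tool shown above.

```python
import os


def list_thread_ids(pid, proc_root="/proc"):
    """Return the sorted thread (LWP) IDs of `pid`.

    Each thread appears as a numerically named directory under
    <proc_root>/<pid>/task, as in the `ls -l /proc/1210/task/` listing.
    """
    task_dir = os.path.join(proc_root, str(pid), "task")
    return sorted(int(name) for name in os.listdir(task_dir) if name.isdigit())
```

    On a Linux host, `list_thread_ids(os.getpid())` lists the threads of the calling process; the `proc_root` argument only exists so the function can be pointed at a different directory tree.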
    

    ref

    what-is-the-difference-between-lightweight-process-and-thread

  • How to inspect the memory usage of a process on Linux

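    I often use top to monitor the processes on a system, but many of its fields are still not clear to me:
    * Why is VIRT sometimes larger than the system's total memory?
    * Why can RES be larger than the Xmx configured for the JVM?
    * Which fields in top relate to memory?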

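    Using the top command

    The CODE and DATA columns are not shown by default: press F in top, then select them with the space bar to display them.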

    top -p 1210
    
    PID USER      PR  NI    VIRT    RES    SHR   CODE    DATA   SWAP S %CPU %MEM     TIME+ COMMAND
    1210 mysql     20   0 1158504 130804   7736  22472 1052816      0 S  0.0  7.0   8:53.50 mariadbd
    

    VIRT

    36. VIRT  --  Virtual Memory Size (KiB)
         The total amount of virtual memory used by the task.  It includes all code, data and shared libraries plus pages that have been swapped out and pages that have  been mapped but not used.
    

    VIRT = CODE + DATA + shared libraries + pages that have been swapped out + pages that have been mapped but not used

    SWAP

    27. SWAP  --  Swapped Size (KiB)
            The non-resident portion of a task's address space.
    

    The size of the memory pages that have been swapped out.

    RES

    17. RES  --  Resident Memory Size (KiB)
        The non-swapped physical memory a task is using.
    

    The physical memory a task is currently using that has not been swapped out.

    CODE

     4. CODE  --  Code Size (KiB)
            The amount of physical memory devoted to executable code, also known as the Text Resident Set size or TRS.
    
    The total physical memory in which executable code resides, i.e. the Text Resident Set (TRS).
    

    DATA

    6. DATA  --  Data + Stack Size (KiB)
         The amount of physical memory devoted to other than executable code, also known as the Data Resident Set size or DRS.
    

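    DATA is the sum of the pages in mappings flagged VM_WRITE & ~VM_SHARED & ~VM_STACK and the pages in mappings flagged VM_STACK, i.e. all writable, non-shared pages outside the stack plus the stack pages.

    $ANON = RES - SHR$ (ANON is the resident anonymous memory, such as memory allocated on the heap)

    $ANON \le DATA$ (vm_physic)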

    SHR

    21. SHR  --  Shared Memory Size (KiB)
            The amount of shared memory available to a task, not all of which is typically resident.  It simply reflects memory that could be potentially shared with other  processes.
    
    

    SHR contains all virtual memory that could be shared with other processes, and RSS contains all memory physically in RAM that is used by the process.

    Thus all shared memory currently in RAM is counted both in SHR and in RSS, so SHR + RSS has no meaning since it can contain duplicate counts.

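    1. Besides the process's own shared memory, SHR also includes memory shared with other processes.
    2. Even if the process uses only a few functions from a shared library, the size of the whole library is counted.
    3. Formula for the physical memory occupied by a single process: RES - SHR.
    4. After pages are swapped out, the value drops.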

    PR

    16. PR  --  Priority
        The scheduling priority of the task.  If you see `rt` in this field, it means the task is running under real time scheduling priority.
    
        Under linux, real time priority is somewhat misleading since traditionally the operating system itself was not preemptible.  And while the 2.6 kernel can be made mostly preemptible, it is not always so.
    

    Real-time priority values range over [1, 99]; they correspond to priorities 0-99, which are the higher priorities.

    In most cases PR value can be computed by the following formula: PR = 20 + NI.

    Theoretically the kernel can change PR value (but not NI) by itself. For example it may reduce the priority of a process if it consumes too much CPU, or it may increase the priority of a process if that process had no chance to run for a long time because of other higher priority processes. In these cases the PR value will be changed by kernel and NI will remain the same, thus the formula “PR = 20 + NI” will not be correct.

    NI

    11. NI  --  Nice Value
        The nice value of the task.  A negative nice value means higher priority, whereas a positive nice value means lower priority.  Zero in this field simply means priority will not be adjusted in determining a task's dispatch-ability.
        The task's static scheduling priority; the value ranges over [-20, 19].
    
    

    Linux actually implements 140 priority levels, numbered 0 to 139; the smaller the number, the higher the priority. Nice values -20 to 19 map onto priorities 100 to 139.
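
    The two mappings described above can be written down as a small sketch. The function names are illustrative, and the offset of 120 for nice 0 is an assumption derived from the stated 100-139 band; the PR formula is the common case quoted above and does not hold when the kernel adjusts PR on its own.

```python
def nice_to_kernel_priority(nice):
    """Map a nice value in [-20, 19] onto the 100-139 band of the 0-139 range."""
    if not -20 <= nice <= 19:
        raise ValueError("nice must be in [-20, 19]")
    return 120 + nice  # -20 -> 100, 0 -> 120, 19 -> 139


def nice_to_top_pr(nice):
    """PR as top shows it for a normal task in most cases: PR = 20 + NI."""
    if not -20 <= nice <= 19:
        raise ValueError("nice must be in [-20, 19]")
    return 20 + nice
```

    For the mariadbd process shown earlier, NI is 0, so this gives PR 20, matching the top output.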

    Using the proc filesystem

    cat /proc/1210/statm
    289626 32701 1934 5618 0 263204 0

    // OS memory page size

    getconf PAGESIZE
    4096

    Table 1-3: Contents of the statm files (as of 2.6.8-rc3)

    | Field | Content | Related top field |
    |---|---|---|
    | size | total program size (pages) (same as VmSize in status) | $$VIRT = 289626 \times 4096 / 1024 = 1158504$$ |
    | resident | size of memory portions (pages) (same as VmRSS in status) | $$RES = 32701 \times 4096 / 1024 = 130804$$ |
    | shared | number of pages that are shared (i.e. backed by a file, same as RssFile+RssShmem in status) | $$SHR = 1934 \times 4096 / 1024 = 7736$$ |
    | trs | number of pages that are 'code' (not including libs; broken, includes data segment) | $$CODE = 5618 \times 4096 / 1024 = 22472$$ |
    | lrs | number of pages of library (always 0 on 2.6) | |
    | drs | number of pages of data/stack (including libs; broken, includes library text) | $$DATA = 263204 \times 4096 / 1024 = 1052816$$ |
    | dt | number of dirty pages (always 0 on 2.6) | |
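
    The page-to-KiB conversions in the table can be reproduced with a short sketch. The function name `statm_to_top_kib` is illustrative, and the default page size of 4096 is taken from the `getconf PAGESIZE` output above rather than queried from the system.

```python
def statm_to_top_kib(statm_line, page_size=4096):
    """Convert one line of /proc/<pid>/statm (in pages) to top's KiB fields."""
    size, resident, shared, trs, _lrs, drs, _dt = (
        int(field) for field in statm_line.split()
    )

    def to_kib(pages):
        return pages * page_size // 1024

    return {
        "VIRT": to_kib(size),
        "RES": to_kib(resident),
        "SHR": to_kib(shared),
        "CODE": to_kib(trs),
        "DATA": to_kib(drs),
    }
```

    Feeding it the line `289626 32701 1934 5618 0 263204 0` yields the VIRT, RES, SHR, CODE and DATA values that top reported for PID 1210. On a live system the page size could be read with `os.sysconf("SC_PAGE_SIZE")` instead of being hard-coded.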

    pmap

    ➜  ~ pmap -X 1210|head -n 5
    1210:   /usr/sbin/mariadbd
             Address Perm   Offset Device Inode    Size    Rss    Pss Referenced Anonymous Swap Locked Mapping
        556335aee000 r-xp 00000000  fd:01 22182   22472   5228   5228       5172         0    0      0 mariadbd
        5563372df000 r--p 015f1000  fd:01 22182    1392   1392   1392       1392      1392    0      0 mariadbd
        55633743b000 rw-p 0174d000  fd:01 22182     720    416    416        416       384    0      0 mariadbd
    ➜  ~ pmap -X 1210|tail -n 5
        7ffc7142e000 rw-p 00000000  00:00     0     132     76     76         76        76    0      0 [stack]
        7ffc714af000 r-xp 00000000  00:00     0       8      4      0          4         0    0      0 [vdso]
    ffffffffff600000 r-xp 00000000  00:00     0       4      0      0          0         0    0      0 [vsyscall]
                                                ======= ====== ====== ========== ========= ==== ======
                                                1158508 131308 128832     131136    123272    0      0 KB
    

    In computing, proportional set size (PSS) is the portion of main memory (RAM) occupied by a process and is composed by the private memory of that process plus the proportion of shared memory with one or more other processes. Unshared memory including the proportion of shared memory is reported as the PSS.
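
    As a worked illustration of that definition, the sketch below adds a process's private memory to its proportional share of each shared mapping. The function name and the (size, sharer count) input format are assumptions made for the example; the kernel computes Pss per page, not from a summary like this.

```python
def proportional_set_size(private_kib, shared_mappings):
    """Return PSS in KiB.

    `shared_mappings` is an iterable of (size_kib, sharer_count) pairs,
    where sharer_count includes the process itself.
    """
    pss = float(private_kib)
    for size_kib, sharers in shared_mappings:
        if sharers < 1:
            raise ValueError("sharer_count must be at least 1")
        pss += size_kib / sharers  # each sharer is charged an equal part
    return pss
```

    For example, 100 KiB of private memory plus a 60 KiB mapping shared by 3 processes and a 40 KiB mapping shared by 2 gives 100 + 20 + 20 = 140 KiB.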

    REF

    top命令显示信息之谜

    Linux top 命令里的内存相关字段(VIRT, RES, SHR, CODE, DATA)

    The /proc Filesystem

    Proportional set size

  • Dropping caches on CentOS

    Writing to this will cause the kernel to drop clean caches, as well as
    reclaimable slab objects like dentries and inodes. Once dropped, their
    memory becomes free.

    To free pagecache:

    echo 1 > /proc/sys/vm/drop_caches

    To free reclaimable slab objects (includes dentries and inodes):

    echo 2 > /proc/sys/vm/drop_caches

    To free slab objects and pagecache:

    echo 3 > /proc/sys/vm/drop_caches

    [root@node2 ~]# free -m
                  total        used        free      shared  buff/cache   available
    Mem:          48025       27055       12938         838        8031       19708
    Swap:         32767         280       32487
    [root@node2 ~]# echo "3" > /proc/sys/vm/drop_caches
    [root@node2 ~]# free -m
                  total        used        free      shared  buff/cache   available
    Mem:          48025       27052       20032         838         939       19778
    Swap:         32767         280       32487
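
    To quantify what the write to drop_caches released, the two `free -m` snapshots can be compared column by column. This is a minimal sketch with illustrative function names; it only parses text in the layout shown above and does not itself touch /proc/sys/vm/drop_caches.

```python
def parse_free_mem(output):
    """Return the Mem: row of `free` output as a dict keyed by column name."""
    lines = [line.strip() for line in output.strip().splitlines()]
    columns = lines[0].split()
    for line in lines[1:]:
        if line.startswith("Mem:"):
            values = [int(value) for value in line.split()[1:]]
            return dict(zip(columns, values))
    raise ValueError("no Mem: row found")


def mem_delta(before, after):
    """Per-column change (after - before) between two `free` snapshots."""
    b, a = parse_free_mem(before), parse_free_mem(after)
    return {column: a[column] - b[column] for column in b}
```

    Applied to the two snapshots above, buff/cache falls by 7092 MiB and free rises by 7094 MiB. Since only clean caches are dropped, running `sync` first lets more of the cache be released.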
    

    Source: setting-proc-sys-vm-drop-caches-to-clear-cache

  • Running nginx in Docker

    Dockerfile

    FROM nginx
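    # Update the package index and install in a single RUN, so that a cached index layer is never paired with a later install
    RUN apt-get update && apt-get install -y vim
    # Chain commands with \ and && so that they produce only one image layer, which keeps the image from bloating
    RUN echo 'This is a locally built nginx image' > /usr/share/nginx/html/index.html \
            && echo 'It provides a simple file download service' >> /usr/share/nginx/html/index.html \
            && mkdir -p /usr/local/nginx_down/ocotpus_http_download
    COPY ./server.conf /etc/nginx/conf.d/default.conf
    # Declare an anonymous volume: if no volume is mounted when the container starts, an anonymous volume is mounted here automatically
    # A mount point created by the VOLUME instruction cannot name a host directory; the host path is generated automatically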
    VOLUME ["/usr/local/nginx_down/ocotpus_http_download/"]
    

    Build the image

    docker build -t octopus_rpm_nginx .

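    Run

    • Container port 80 is mapped to port 8081 on the host
    • The host directory /usr/local/nginx_down/ocotpus_http_download/ is mounted into the container
    • --rm removes the container when it stops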
    docker run -d  -p 192.168.8.73:8081:80 \
    -v /usr/local/nginx_down/ocotpus_http_download/:/usr/local/nginx_down/ocotpus_http_download/ \
    --rm \
    --name rpmnginx  octopus_rpm_nginx
    
  • inode

    Source: 理解inode - 阮一峰的网络日志 (Understanding inodes, Ruan Yifeng's blog)

    File inodes

    Every file has a corresponding inode, which holds information about that file.

    ➜  ~ stat /tmp/hui/x
      File: ‘/tmp/hui/x’
      Size: 0           Blocks: 0          IO Block: 4096   regular empty file
    Device: fd01h/64769d    Inode: 395197      Links: 1
    Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
    Access: 2021-04-12 13:56:18.677116921 +0800
    Modify: 2021-04-12 13:56:18.677116921 +0800
    Change: 2021-04-12 13:56:18.677116921 +0800
     Birth: -
    
    ➜  ~ ls -i /tmp/hui/x
    395197 /tmp/hui/x
    

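    Because inode numbers are decoupled from file names, Unix/Linux systems show some distinctive behaviors.
      1. Sometimes a file name contains special characters and the file cannot be deleted normally. Deleting it by its inode then has the same effect as deleting the file.
      2. Moving or renaming a file within the same filesystem only changes the file name; the inode number is unaffected.
      3. Once a file is open, the system identifies it by inode number and no longer considers the file name. As a result, the system generally cannot derive a file name from an inode number.
    Point 3 makes software updates simple: a program can be updated without being shut down or restarted, because the system identifies a running file by its inode number rather than by its file name. During an update, the new version is written under the same file name with a new inode and does not affect the running file. The next time the program is run, the file name resolves to the new version, and the inode of the old version is reclaimed.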
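
    The update behavior in point 3 can be observed with a small experiment: keep a file open, replace it under the same name, and compare what the open handle and a fresh open see. This is a sketch for a POSIX filesystem; the function name and the file names `app` and `app.new` are made up for the example.

```python
import os


def replace_while_open(directory):
    """Replace a file under the same name while an old handle stays open."""
    path = os.path.join(directory, "app")
    with open(path, "w") as f:
        f.write("v1")
    old_inode = os.stat(path).st_ino

    running = open(path)  # stands in for the running program's open file

    new_path = path + ".new"
    with open(new_path, "w") as f:
        f.write("v2")
    os.replace(new_path, path)  # the name "app" now points at a new inode
    new_inode = os.stat(path).st_ino

    running_view = running.read()  # the old handle still reads the old inode
    running.close()
    with open(path) as f:
        fresh_view = f.read()  # a new open resolves the name to the new inode
    return old_inode, new_inode, running_view, fresh_view
```

    The two inode numbers differ, the handle opened before the replacement still reads "v1", and a fresh open reads "v2".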

    ➜  ~ lsblk
    NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
    sr0     11:0    1 41.2M  0 rom
    vda    253:0    0   50G  0 disk
    └─vda1 253:1    0   50G  0 part /
    ➜  ~ df -i /dev/vda1
    Filesystem      Inodes  IUsed   IFree IUse% Mounted on
    /dev/vda1      3276800 197965 3078835    7% /
    
    ➜  ~ dumpe2fs -h /dev/vda1|grep "Inode size"
    dumpe2fs 1.42.9 (28-Dec-2013)
    Inode size:           256
    

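    Hard links

    • Same inode
    • The source and target file names have the same inode number; both point to the same inode
    • The inode has a field called the "link count", which records the total number of file names pointing to it; creating a hard link increases it by 1.
    • Conversely, deleting a file name decreases the inode's link count by 1.
    • When the count drops to 0, no file name points to the inode any more, and the system reclaims the inode number and the blocks that belong to it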
    ➜  ~ ln /tmp/hui/x /tmp/hui/x1
    ➜  ~ ls -li /tmp/hui
    total 0
    395197 -rw-r--r-- 2 root root 0 Apr 12 13:56 x
    395197 -rw-r--r-- 2 root root 0 Apr 12 13:56 x1
    
    ➜  ~ stat /tmp/hui/x1
      File: ‘/tmp/hui/x1’
      Size: 0           Blocks: 0          IO Block: 4096   regular empty file
    Device: fd01h/64769d    Inode: 395197      Links: 2
    Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
    Access: 2021-04-12 13:56:18.677116921 +0800
    Modify: 2021-04-12 13:56:18.677116921 +0800
    Change: 2021-04-12 13:57:57.015423754 +0800
     Birth: -
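
    The link-count bookkeeping shown by `ls -li` and `stat` can also be checked from code. The following sketch mirrors the `ln /tmp/hui/x /tmp/hui/x1` example inside a caller-supplied directory; the function name is illustrative.

```python
import os


def hard_link_demo(directory):
    """Create a file, hard-link it, remove the link; report the link counts."""
    src = os.path.join(directory, "x")
    dst = os.path.join(directory, "x1")
    open(src, "w").close()

    links_before = os.stat(src).st_nlink
    os.link(src, dst)  # equivalent of `ln x x1`
    same_inode = os.stat(src).st_ino == os.stat(dst).st_ino
    links_after_link = os.stat(src).st_nlink

    os.unlink(dst)  # deleting one name decrements the count
    links_after_unlink = os.stat(src).st_nlink
    return same_inode, links_before, links_after_link, links_after_unlink
```

    Both names report the same inode number, and the link count goes from 1 to 2 and back to 1.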
    

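    Soft links

    The biggest difference between a soft link and a hard link: a soft link points to a file name rather than to the file's inode, so its inode number is different.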

    ➜  ~ ln -s /tmp/hui/x /tmp/hui/x2
    ➜  ~ ls -li /tmp/hui
    total 0
    395197 -rw-r--r-- 2 root root  0 Apr 12 13:56 x
    395197 -rw-r--r-- 2 root root  0 Apr 12 13:56 x1
    395212 lrwxrwxrwx 1 root root 10 Apr 12 13:58 x2 -> /tmp/hui/x
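
    A short sketch makes the difference from hard links concrete: the soft link has its own inode, stores the target path as its content (which is why `ls` shows a size of 10 for the 10-character path /tmp/hui/x), and dangles once the target name is deleted. The function name is illustrative.

```python
import os


def symlink_demo(directory):
    """Create a soft link and report how it relates to its target."""
    target = os.path.join(directory, "x")
    link = os.path.join(directory, "x2")
    open(target, "w").close()
    os.symlink(target, link)  # equivalent of `ln -s x x2`

    # lstat inspects the link itself; stat would follow it to the target.
    different_inode = os.lstat(link).st_ino != os.stat(target).st_ino
    stores_path = os.readlink(link) == target
    target_links = os.stat(target).st_nlink  # unchanged by the soft link

    os.unlink(target)
    dangling = os.path.islink(link) and not os.path.exists(link)
    return different_inode, stores_path, target_links, dangling
```

    The target's link count stays at 1, because a soft link does not add a name to the target's inode.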
    
    

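    Directories

    The total number of hard links to any directory = 2 + the number of its subdirectories (including hidden ones)

    When a directory is created, two directory entries are generated by default: "." and "..".
    * The inode number of "." is the inode number of the current directory, so "." is equivalent to a hard link to the current directory;
    * The inode number of ".." is the inode number of the parent directory, so ".." is equivalent to a hard link to the parent directory.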

    ➜  ~ stat /tmp/hui
      File: ‘/tmp/hui’
      Size: 4096        Blocks: 8          IO Block: 4096   directory
    Device: fd01h/64769d    Inode: 394833      Links: 2
    Access: (0755/drwxr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
    Access: 2021-04-12 13:58:36.718546254 +0800
    Modify: 2021-04-12 13:58:31.312529621 +0800
    Change: 2021-04-12 13:58:31.312529621 +0800
     Birth: -
    
    ➜  ~ mkdir /tmp/hui/huichild
    
    ➜  ~ stat /tmp/hui
      File: ‘/tmp/hui’
      Size: 4096        Blocks: 8          IO Block: 4096   directory
    Device: fd01h/64769d    Inode: 394833      Links: 3
    Access: (0755/drwxr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
    Access: 2021-04-12 13:58:36.718546254 +0800
    Modify: 2021-04-12 14:02:40.264280838 +0800
    Change: 2021-04-12 14:02:40.264280838 +0800
     Birth: -
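
    The rule "2 + number of subdirectories" can be turned into a small check. The sketch below computes the expected count by scanning the directory; the function name is illustrative. Comparing its result with `os.stat(path).st_nlink` reproduces the 2 -> 3 change seen above, although some filesystems do not maintain directory link counts this way.

```python
import os


def expected_dir_link_count(path):
    """Expected hard link count of a directory: 2 + number of subdirectories.

    The 2 covers the directory's name in its parent and its own "." entry;
    every subdirectory adds one more through its ".." entry.
    """
    with os.scandir(path) as entries:
        subdirs = sum(
            1 for entry in entries if entry.is_dir(follow_symlinks=False)
        )
    return 2 + subdirs
```

    Regular files and soft links inside the directory do not change the result; hidden subdirectories do.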