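Symptom
vSAN storage capacity shrank. The host raised an alarm: vSAN data error.
Use the Linux top command to inspect a process's memory.
The CODE and DATA columns are not shown by default: press F, then select them with the space bar to display them.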
top -p 1210
PID USER PR NI VIRT RES SHR CODE DATA SWAP S %CPU %MEM TIME+ COMMAND
1210 mysql 20 0 1158504 130804 7736 22472 1052816 0 S 0.0 7.0 8:53.50 mariadbd
36. VIRT -- Virtual Memory Size (KiB)
The total amount of virtual memory used by the task. It includes all code, data and shared libraries plus pages that have been swapped out and pages that have been mapped but not used.
VIRT = CODE + DATA + shared libraries + pages that have been swapped out + pages that have been mapped but not used
27. SWAP -- Swapped Size (KiB)
The non-resident portion of a task's address space.
That is, the size of the memory pages that have been swapped out.
17. RES -- Resident Memory Size (KiB)
The non-swapped physical memory a task is using.
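That is, the physical memory a task is using that has not been swapped out.
For example, the program below shows that RES includes shared anonymous mmap memory that is also counted in SHR.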
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    const size_t len = 50 << 20;              /* 50 MiB */
    const long page = sysconf(_SC_PAGESIZE);  /* usually 4096 */

    /* mmap 50 MiB of shared anonymous memory */
    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_ANONYMOUS | MAP_SHARED, -1, 0);
    if (p == MAP_FAILED) {
        perror("mmap");
        return 1;
    }
    /* Touch every single page to make them resident */
    for (size_t off = 0; off < len; off += (size_t)page) {
        p[off] = 1;
    }
    /* Let us see the process in top */
    sleep(1000000);
    return 0;
}
gcc -std=gnu99 main.c
./a.out &
pgrep a.out
339065
top -p 339065
PID USER VIRT RES SHR S %CPU %MEM CODE DATA TIME+ COMMAND
338377 root 55412 51564 51476 S 0.0 0.1 4 180 0:00.01 a.out
4. CODE -- Code Size (KiB)
The amount of physical memory devoted to executable code, also known as the Text Resident Set size or TRS.
That is, the total physical memory occupied by resident executable code (Text Resident Set, TRS).
6. DATA -- Data + Stack Size (KiB)
The amount of physical memory devoted to other than executable code, also known as the Data Resident Set size or DRS.
The DATA column contains the amount of reserved private anonymous memory. By definition, private anonymous memory is memory that is specific to the program and holds its data. It can only be shared by forking, in a copy-on-write fashion. It includes (but is not limited to) the stacks and the heap (but we will see later that it only partially contains the data segment of the loaded executables). This column does not say how much memory the program actually uses. It only tells us that the program reserved some amount of memory, which may be left untouched for a long time.
- DATA includes the stacks and the heap, but is not limited to them.
- DATA cannot tell us how much memory the program actually uses. It only tells us that the program "reserved" a certain amount of memory, and that memory may stay untouched for a long time.
$$ANON = RES - SHR$$ (ANON is the memory allocated on the heap.)
$$ANON \le DATA$$ (vm_physic)
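As a quick check of the two relations above, the sketch below plugs in the values from the `top -p 1210` line for mariadbd (RES 130804, SHR 7736, DATA 1052816, all in KiB). The function name `anon_kib` is made up for illustration.

```python
def anon_kib(res_kib: int, shr_kib: int) -> int:
    """ANON = RES - SHR, with all values in KiB."""
    return res_kib - shr_kib


# Values copied from the top output for PID 1210 (mariadbd).
RES, SHR, DATA = 130804, 7736, 1052816

anon = anon_kib(RES, SHR)
print(f"ANON = {anon} KiB")
print(f"ANON <= DATA: {anon <= DATA}")
```

The result should land near the Anonymous total that `pmap -X` reports for the same process, although the two tools were sampled at different moments.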
21. SHR -- Shared Memory Size (KiB)
The amount of shared memory available to a task, not all of which is typically resident. It simply reflects memory that could be potentially shared with other processes.
In short, SHR only reflects memory that could be shared with other processes, and not all of it is resident.
SHR contains all virtual memory that could be shared with other processes, and RSS contains all memory physically in RAM that is used by the process.
Thus all shared memory currently in RAM is counted in both SHR and RSS, so SHR + RSS has no meaning: the sum can contain duplicate counts.
cat /proc/1210/statm
289626 32701 1934 5618 0 263204 0
OS memory page size:
getconf PAGESIZE
4096
Table 1-3: Contents of the statm files (as of 2.6.8-rc3)
Field | Content | Notes | Corresponding top field (KiB) |
---|---|---|---|
size | total program size (pages) | (same as VmSize in status) | $$VIRT = 289626 \times 4096 / 1024 = 1158504$$ |
resident | size of memory portions (pages) | (same as VmRSS in status) | $$RES = 32701 \times 4096 / 1024 = 130804$$ |
shared | number of pages that are shared | (i.e. backed by a file, same as RssFile+RssShmem in status) | $$SHR = 1934 \times 4096 / 1024 = 7736$$ |
trs | number of pages that are 'code' | (not including libs; broken, includes data segment) | $$CODE = 5618 \times 4096 / 1024 = 22472$$ |
lrs | number of pages of library | (always 0 on 2.6) | |
drs | number of pages of data/stack | (including libs; broken, includes library text) | $$DATA = 263204 \times 4096 / 1024 = 1052816$$ |
dt | number of dirty pages | (always 0 on 2.6) | |
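The conversions in the table can be reproduced with a few lines of Python. This is a minimal sketch that assumes the seven-field statm layout listed above and a 4096-byte page. The function name `statm_to_top_kib` is hypothetical.

```python
def statm_to_top_kib(statm_line: str, page_size: int = 4096) -> dict:
    """Convert one line of /proc/<pid>/statm (in pages) to top's KiB columns."""
    fields = ("size", "resident", "shared", "trs", "lrs", "drs", "dt")
    pages = dict(zip(fields, (int(x) for x in statm_line.split())))
    kib_per_page = page_size // 1024
    return {
        "VIRT": pages["size"] * kib_per_page,
        "RES": pages["resident"] * kib_per_page,
        "SHR": pages["shared"] * kib_per_page,
        "CODE": pages["trs"] * kib_per_page,
        "DATA": pages["drs"] * kib_per_page,
    }


# The statm line captured for PID 1210 (mariadbd).
sample = "289626 32701 1934 5618 0 263204 0"
for column, kib in statm_to_top_kib(sample).items():
    print(f"{column:>4} = {kib} KiB")
```

On a live system the line would come from reading `/proc/<pid>/statm`, and the page size from `getconf PAGESIZE`.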
➜ ~ pmap -X 1210|head -n 5
1210: /usr/sbin/mariadbd
Address Perm Offset Device Inode Size Rss Pss Referenced Anonymous Swap Locked Mapping
556335aee000 r-xp 00000000 fd:01 22182 22472 5228 5228 5172 0 0 0 mariadbd
5563372df000 r--p 015f1000 fd:01 22182 1392 1392 1392 1392 1392 0 0 mariadbd
55633743b000 rw-p 0174d000 fd:01 22182 720 416 416 416 384 0 0 mariadbd
➜ ~ pmap -X 1210|tail -n 5
7ffc7142e000 rw-p 00000000 00:00 0 132 76 76 76 76 0 0 [stack]
7ffc714af000 r-xp 00000000 00:00 0 8 4 0 4 0 0 0 [vdso]
ffffffffff600000 r-xp 00000000 00:00 0 4 0 0 0 0 0 0 [vsyscall]
======= ====== ====== ========== ========= ==== ======
1158508 131308 128832 131136 123272 0 0 KB
In computing, proportional set size (PSS) is the portion of main memory (RAM) occupied by a process. It is composed of the private memory of that process plus its proportional share of the memory it shares with one or more other processes. Unshared memory plus that proportion of shared memory is reported as the PSS.
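A small worked example may make the definition concrete. The numbers below are invented for illustration and do not come from the mariadbd process. Each shared mapping is divided evenly among the processes that map it.

```python
def pss_kib(private_kib: float, shared) -> float:
    """PSS = private memory + sum(shared size / number of sharers).

    `shared` is an iterable of (size_kib, processes_sharing) pairs.
    """
    return private_kib + sum(size / sharers for size, sharers in shared)


# Hypothetical process: 1000 KiB private, one 600 KiB mapping shared by
# 3 processes and one 400 KiB mapping shared by 2 processes.
pss = pss_kib(1000, [(600, 3), (400, 2)])
print(f"PSS = {pss} KiB")
```

Here the process is charged 200 KiB for each shared mapping, so its PSS is smaller than its RSS (2000 KiB) but larger than its private memory.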
G1 will try expand the heap if the amount of time you spend doing GC work versus application work is greater than a specific threshold. Note: If your min/max heap are the same, expansion cannot occur.
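In this case the heap size is already fixed, so the JVM heap will not expand any further.
Linux gives every process the same virtual address space, which keeps processes independent of one another. It does this with virtual memory: each process gets a range of virtual memory, and physical memory is allocated only when that virtual memory is actually used.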
With -Xms10g -Xmx10g, the JVM asks the operating system at startup for 10 GB of memory to use as heap. The operating system tries to allocate that memory for the JVM (shown as VIRT), but it does not promise to back it with physical memory; some of it may end up in swap.
You will also find that VIRT is not exactly 10g. The 10g is only the heap size, and a JVM contains much more than the heap: thread stacks, PermGen (HotSpot before JDK 8; replaced by Metaspace from JDK 8 on), native stacks, code, mapped files, and so on.
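The gap between reserved and resident memory can be observed from any process, not only a JVM. The sketch below is an assumption-laden demo, not a JVM measurement: it maps 32 MiB of anonymous memory with Python's `mmap`, then reads `/proc/self/statm` before and after touching the pages. The helper names `statm_kib` and `demo` are made up, and the script only reports numbers on Linux.

```python
import mmap
import os


def statm_kib():
    """Return (VIRT, RES) of this process in KiB, or None if /proc is missing."""
    try:
        with open("/proc/self/statm") as f:
            size, resident = f.read().split()[:2]
    except OSError:
        return None
    kib_per_page = os.sysconf("SC_PAGE_SIZE") // 1024
    return int(size) * kib_per_page, int(resident) * kib_per_page


def demo(length: int = 32 << 20):
    """Map `length` bytes, then touch every page; return three (VIRT, RES) samples."""
    before = statm_kib()
    if before is None:
        return None
    buf = mmap.mmap(-1, length)      # anonymous mapping: VIRT grows immediately
    mapped = statm_kib()
    for off in range(0, length, mmap.PAGESIZE):
        buf[off] = 1                 # the first write makes the page resident
    touched = statm_kib()
    buf.close()
    return before, mapped, touched


samples = demo()
if samples is None:
    print("/proc/self/statm not available on this system")
else:
    for label, (virt, res) in zip(("before", "mapped", "touched"), samples):
        print(f"{label:>8}: VIRT={virt} KiB RES={res} KiB")
```

VIRT should jump as soon as the mapping is created, while RES should only follow once the pages have been written.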
Heap usage larger than RES
[root@node2 octopus]# /usr/lib/jvm/java-11/bin/jhsdb jmap --heap --pid 31821
Attaching to process ID 31821, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 11.0.11+9-LTS
using thread-local object allocation.
Garbage-First (G1) GC with 13 thread(s)
Heap Configuration:
MinHeapFreeRatio = 40
MaxHeapFreeRatio = 70
MaxHeapSize = 19327352832 (18432.0MB)
NewSize = 1363144 (1.2999954223632812MB)
MaxNewSize = 11593056256 (11056.0MB)
OldSize = 5452592 (5.1999969482421875MB)
NewRatio = 2
SurvivorRatio = 8
MetaspaceSize = 21807104 (20.796875MB)
CompressedClassSpaceSize = 1073741824 (1024.0MB)
MaxMetaspaceSize = 17592186044415 MB
G1HeapRegionSize = 16777216 (16.0MB)
Heap Usage:
G1 Heap:
regions = 1152
capacity = 19327352832 (18432.0MB)
used = 17792765976 (16968.503929138184MB) # note this value
free = 1534586856 (1463.4960708618164MB)
92.06002565721671% used
G1 Young Generation:
Eden Space:
regions = 11
capacity = 872415232 (832.0MB)
used = 184549376 (176.0MB)
free = 687865856 (656.0MB)
21.153846153846153% used
Survivor Space:
regions = 8
capacity = 134217728 (128.0MB)
used = 134217728 (128.0MB)
free = 0 (0.0MB)
100.0% used
G1 Old Generation:
regions = 1064
capacity = 18320719872 (17472.0MB)
used = 17490776088 (16680.503929138184MB)
free = 829943784 (791.4960708618164MB)
95.46991717684399% used
[root@node2 octopus]# top -p 31821
top - 16:54:33 up 21:52, 2 users, load average: 0.00, 0.01, 0.05
Tasks: 1 total, 0 running, 1 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 49177608 total, 391092 free, 22373724 used, 26412792 buff/cache
KiB Swap: 33554428 total, 33554428 free, 0 used. 26303500 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
31821 root 20 0 159.4g 15.0g 25284 S 0.0 32.0 5:40.87 jsvc
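The JVM heap reports used = 17792765976 (16968.503929138184MB),
yet RES in top for the process is still only 15.0g (RES - SHR).
This looks odd at first, but tallying the objects in the JVM heap shows that they actually occupy only about 11 GB.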
[root@node2 octopus]# /usr/lib/jvm/java-11/bin/jmap -histo 31821 |head -n 5
num #instances #bytes class name (module)
-------------------------------------------------------
1: 3094 10086075648 [J (java.base@11.0.11)
2: 90396 1081725104 [B (java.base@11.0.11)
3: 11750 203173760 [I (java.base@11.0.11)
[root@node2 octopus]# /usr/lib/jvm/java-11/bin/jmap -histo 31821 |tail -n 5
1661: 1 16 sun.util.locale.provider.TimeZoneNameUtility$TimeZoneNameGetter (java.base@11.0.11)
1662: 1 16 sun.util.logging.internal.LoggingProviderImpl (java.logging@11.0.11)
1663: 1 16 sun.util.resources.LocaleData$LocaleDataStrategy (java.base@11.0.11)
1664: 1 16 sun.util.resources.cldr.provider.CLDRLocaleDataMetaInfo (jdk.localedata@11.0.11)
Total 789125 11397387608 # note this value
This shows that the actual usage of the JVM heap includes more than the live objects. Could it be the total size of all the regions used to store objects?
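To compare the three figures in one unit, the sketch below converts the byte counts copied from the `jhsdb jmap --heap` and `jmap -histo` outputs to GiB. It only does the arithmetic and does not explain where the gap comes from.

```python
GIB = 1024 ** 3

heap_used_bytes = 17792765976    # "used" under G1 Heap in jhsdb jmap --heap
histo_total_bytes = 11397387608  # "Total" line of jmap -histo
res_gib = 15.0                   # RES column from top, already in GiB


def to_gib(n_bytes: int) -> float:
    """Convert a byte count to GiB."""
    return n_bytes / GIB


print(f"heap used   : {to_gib(heap_used_bytes):.2f} GiB")
print(f"histo total : {to_gib(histo_total_bytes):.2f} GiB")
print(f"top RES     : {res_gib:.2f} GiB")
print(f"heap - histo: {to_gib(heap_used_bytes - histo_total_bytes):.2f} GiB")
```

Laid out this way, RES sits between the object total from the histogram and the heap usage reported by G1.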
The virtual memory map contains a lot of stuff: some of it is shared, and some of it is allocated but never touched (e.g., almost all of the 4 GB of heap in this example).
But the operating system is smart enough to only load what it needs, so the virtual memory size is largely irrelevant.
Where virtual memory size is important is if you're running on a 32-bit operating system, where you can only allocate 2 GB (or, in some cases, 3 GB) of process address space. In that case you're dealing with a scarce resource, and might have to make tradeoffs, such as reducing your heap size in order to memory-map a large file or create lots of threads. (Older machines were 32-bit, so logical addressing could reach at most 4 GB. After subtracting the part reserved by the system, a process on most machines could access only 3 GB.)
But, given that 64-bit machines are ubiquitous, I don’t think it will be long before Virtual Memory Size is a completely irrelevant statistic.
Resident Set size is that portion of the virtual memory space that is actually in RAM. If your RSS grows to be a significant portion of your total physical memory, it might be time to start worrying. If your RSS grows to take up all your physical memory, and your system starts swapping, it’s well past time to start worrying.
But RSS is also misleading, especially on a lightly loaded machine. The operating system doesn't expend a lot of effort reclaiming the pages used by a process. There's little benefit to be gained by doing so, and the potential for an expensive page fault if the process touches the page in the future. As a result, the RSS statistic may include lots of pages that aren't in active use.
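The code compiles and runs on CentOS 7. That environment was being used by colleagues for testing, though, and I happened to have an idle RHEL 8 environment at hand.
On RHEL 8 the build succeeds, but the program crashes at run time.
The log shows: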
fusionsphere/so/libfc.so: undefined symbol: _ZN16CFusionSphereSDK10InitialSDKEPFviPKczEP23CFusionSphereDebugLevelRKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESD_SD_
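Yet the program clearly compiled without errors, and it does not even use generics or RTTI-like features. It is obviously not a segmentation fault from accessing invalid memory either.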
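One lead worth checking, offered as an assumption rather than a confirmed diagnosis: the mangled name contains `St7__cxx11`, which is how libstdc++ marks `std::__cxx11::basic_string`, the string type of its newer C++11 ABI. A missing symbol of this kind is commonly seen when the caller and the shared library were built with different `_GLIBCXX_USE_CXX11_ABI` settings. The helper below is hypothetical and only tests for that marker.

```python
def uses_cxx11_abi(mangled: str) -> bool:
    """True if a mangled name references the std::__cxx11 inline namespace.

    In Itanium C++ name mangling, identifiers are length-prefixed,
    so the namespace `__cxx11` appears as `7__cxx11`.
    """
    return "7__cxx11" in mangled


# The undefined symbol copied from the log.
symbol = (
    "_ZN16CFusionSphereSDK10InitialSDKEPFviPKczEP23CFusionSphereDebugLevel"
    "RKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESD_SD_"
)
print("references std::__cxx11:", uses_cxx11_abi(symbol))
```

If the marker is present in the caller but absent from the symbols exported by the library (for example as listed by `nm -D`), an ABI mismatch between the two build environments becomes a plausible explanation.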
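Tuning must focus on the key points. Caching has drawbacks of its own. How can the cache deliver more value? Solve the problem at the hardware level.
The maximum number of connection requests that can queue up while current connections are being handled.
max_connections
The maximum number of concurrent client connections the server supports. Use show status to check the Max_used_connections variable.
table_open_cache
When the server opens files, it tries to keep them open in order to reduce the number of file open and close operations it has to perform. Use show global status like 'Opened_tables' to evaluate it.
If Opened_tables grows rapidly, the cache is too small.
Positively correlated with table_open_cache; controls the size of the cache that stores table definitions.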
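"Grows rapidly" can be made measurable by sampling Opened_tables twice and dividing by the elapsed time. The sketch below assumes the two readings were taken by hand from `show global status like 'Opened_tables'`. The sample values and the function name `opened_tables_rate` are invented for illustration.

```python
def opened_tables_rate(first: int, second: int, interval_s: float) -> float:
    """Table opens per second between two Opened_tables readings."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return (second - first) / interval_s


# Hypothetical readings taken 60 seconds apart.
rate = opened_tables_rate(first=1200, second=1800, interval_s=60)
print(f"Opened_tables grows by {rate:.1f} tables/s")
```

What counts as too fast depends on the workload, so the rate is best compared before and after a change to table_open_cache.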
open_files_limit
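The maximum size of the client communication buffer. The default is 1 MB and the maximum allowed value is 1 GB. The following two variables may also need to be increased accordingly.
sort_buffer_size: the size of the buffer used by sort operations.
It affects every session, so it should be adjusted gradually.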
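Basic components
Commonly tuned parameters
Buffer pool size, in bytes.
innodb_buffer_pool_instances
If innodb_buffer_pool_size >= 1 GB and innodb_buffer_pool_instances > 1, InnoDB splits the buffer pool into several smaller buffer pool instances. Pages are assigned to the instances at random, which reduces contention between concurrent threads.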
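The rule above can be sketched as follows. This is an illustration under stated assumptions, not InnoDB's implementation: Python's built-in `hash` stands in for whatever function the server uses to spread pages, and the names `effective_instances` and `pick_instance` are hypothetical.

```python
GIB = 1024 ** 3


def effective_instances(pool_size_bytes: int, instances: int) -> int:
    """Multiple instances only apply when the pool is at least 1 GB."""
    if pool_size_bytes >= GIB and instances > 1:
        return instances
    return 1


def pick_instance(space_id: int, page_no: int, n_instances: int) -> int:
    """Map a page to one instance (stand-in hash, not InnoDB's own)."""
    return hash((space_id, page_no)) % n_instances


n = effective_instances(8 * GIB, 8)
counts = [0] * n
for page_no in range(10_000):
    counts[pick_instance(0, page_no, n)] += 1
print("pages per instance:", counts)
```

Because different pages land in different instances, threads working on unrelated pages contend for different instance-level locks instead of a single one.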
Parameters that affect buffer pool cache invalidation
Drawbacks