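The kvmalloc function
You have probably agonized at some point over whether to use kmalloc() or vmalloc(). You no longer need to: the kernel now has an API called kvmalloc(), which can be thought of as kmalloc() and vmalloc() combined into one, like the Dragon-Slaying Saber and the Heaven-Reliant Sword fused into a single weapon.
A lot of kernel code now uses kvmalloc(), for example: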
source/ipc/msg.c
static int newque(struct ipc_namespace *ns, struct ipc_params *params)
{
    struct msg_queue *msq;
    int retval;
    key_t key = params->key;
    int msgflg = params->flg;

    msq = kvmalloc(sizeof(*msq), GFP_KERNEL);
    if (unlikely(!msq))
        return -ENOMEM;
    ...
}
In earlier kernels (for example v4.0-rc7/source/ipc/msg.c), this code was:
static int newque(struct ipc_namespace *ns, struct ipc_params *params)
{
    struct msg_queue *msq;
    int id, retval;
    key_t key = params->key;
    int msgflg = params->flg;

    msq = ipc_rcu_alloc(sizeof(*msq));
    if (!msq)
        return -ENOMEM;
    ...
}
It appears to allocate memory with this call:
ipc_rcu_alloc(sizeof(*msq))
So what does ipc_rcu_alloc() actually do? In that kernel it obtains its memory from ipc_alloc(), which looks like this:
void *ipc_alloc(int size)
{
    void *out;

    if (size > PAGE_SIZE)
        out = vmalloc(size);
    else
        out = kmalloc(size, GFP_KERNEL);
    return out;
}
The logic is: use vmalloc() when the request is larger than one page, and kmalloc() when it is one page or less.
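The size test above is simple enough to model outside the kernel. The following is a minimal userspace sketch, not kernel code: `MODEL_PAGE_SIZE`, `enum backend`, and `ipc_alloc_backend()` are hypothetical names introduced here, and the 4096-byte page size is an assumption (the real PAGE_SIZE depends on the architecture).

```c
#include <stddef.h>

/* Assumed page size for illustration; the kernel's PAGE_SIZE is per-architecture. */
#define MODEL_PAGE_SIZE 4096u

/* Hypothetical labels standing in for the two kernel allocators. */
enum backend { BACKEND_KMALLOC, BACKEND_VMALLOC };

/*
 * Mirrors the branch in the old ipc_alloc(): a request strictly larger
 * than one page goes to vmalloc(), everything else goes to kmalloc().
 */
static enum backend ipc_alloc_backend(size_t size)
{
    return size > MODEL_PAGE_SIZE ? BACKEND_VMALLOC : BACKEND_KMALLOC;
}
```

Under these assumptions a request of exactly one page still selects kmalloc, and one byte more selects vmalloc. Note that the choice is made on size alone: for a large request, kmalloc() is never even attempted.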
The implementation of kvmalloc(), by contrast, handles the same kind of decision far more intelligently:
void *kvmalloc_node(size_t size, gfp_t flags, int node)
{
    gfp_t kmalloc_flags = flags;
    void *ret;

    /*
     * vmalloc uses GFP_KERNEL for some internal allocations (e.g page tables)
     * so the given set of flags has to be compatible.
     */
    if ((flags & GFP_KERNEL) != GFP_KERNEL)
        return kmalloc_node(size, flags, node);

    /*
     * We want to attempt a large physically contiguous block first because
     * it is less likely to fragment multiple larger blocks and therefore
     * contribute to a long term fragmentation less than vmalloc fallback.
     * However make sure that larger requests are not too disruptive - no
     * OOM killer and no allocation failure warnings as we have a fallback.
     */
    if (size > PAGE_SIZE) {
        kmalloc_flags |= __GFP_NOWARN;

        if (!(kmalloc_flags & __GFP_RETRY_MAYFAIL))
            kmalloc_flags |= __GFP_NORETRY;
    }

    ret = kmalloc_node(size, kmalloc_flags, node);

    /*
     * It doesn't really make sense to fallback to vmalloc for sub page
     * requests
     */
    if (ret || size <= PAGE_SIZE)
        return ret;

    return __vmalloc_node_flags_caller(size, node, flags,
            __builtin_return_address(0));
}
EXPORT_SYMBOL(kvmalloc_node);

static inline void *kvmalloc(size_t size, gfp_t flags)
{
    return kvmalloc_node(size, flags, NUMA_NO_NODE);
}
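When the request is larger than one page, kvmalloc() first makes a kmalloc() attempt with __GFP_NOWARN and, unless the caller passed __GFP_RETRY_MAYFAIL, with __GFP_NORETRY as well; if that attempt fails, it falls back to vmalloc. (The NORETRY flag keeps kmalloc from retrying over and over, or even invoking the OOM killer, when the allocation fails.)
Of course, if the size passed to kvmalloc() is no more than one page, it follows the old kmalloc() path: __GFP_NORETRY is not set, and if the allocation still fails after repeated attempts there is no fallback to vmalloc(), because vmalloc() is not suited to allocations smaller than a page.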
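The control flow described above can be traced with a small userspace model. This is only a sketch under stated assumptions, not the kernel implementation: `kvmalloc_model()`, `struct decision`, `enum outcome`, and the `F_*` flag bits are hypothetical names and values introduced here (the real GFP flags have different values, and GFP_KERNEL is a multi-bit mask), and the page size is assumed to be 4096 bytes. The two boolean parameters stand in for whether the simulated kmalloc and vmalloc attempts would succeed.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed page size; the kernel's PAGE_SIZE is per-architecture. */
#define SKETCH_PAGE_SIZE 4096u

/* Hypothetical flag bits for illustration; not the real GFP values. */
#define F_GFP_KERNEL    0x1u
#define F_NOWARN        0x2u
#define F_NORETRY       0x4u
#define F_RETRY_MAYFAIL 0x8u

enum outcome { OUT_KMALLOC, OUT_VMALLOC, OUT_FAIL };

struct decision {
    unsigned int kmalloc_flags; /* flags handed to the kmalloc attempt */
    enum outcome outcome;       /* which allocator satisfied the request, if any */
};

/* Models the branches of kvmalloc_node() without allocating anything. */
static struct decision kvmalloc_model(size_t size, unsigned int flags,
                                      bool kmalloc_ok, bool vmalloc_ok)
{
    struct decision d = { flags, OUT_FAIL };

    /* Flags not compatible with GFP_KERNEL: kmalloc only, no fallback. */
    if ((flags & F_GFP_KERNEL) != F_GFP_KERNEL) {
        d.outcome = kmalloc_ok ? OUT_KMALLOC : OUT_FAIL;
        return d;
    }

    /* Large request: make the kmalloc attempt quiet and non-persistent. */
    if (size > SKETCH_PAGE_SIZE) {
        d.kmalloc_flags |= F_NOWARN;
        if (!(d.kmalloc_flags & F_RETRY_MAYFAIL))
            d.kmalloc_flags |= F_NORETRY;
    }

    if (kmalloc_ok) {
        d.outcome = OUT_KMALLOC;
        return d;
    }

    /* Requests of one page or less never fall back to vmalloc. */
    if (size <= SKETCH_PAGE_SIZE)
        return d;

    d.outcome = vmalloc_ok ? OUT_VMALLOC : OUT_FAIL;
    return d;
}
```

In this model, an 8192-byte request with `F_GFP_KERNEL` whose kmalloc attempt fails ends up on the vmalloc path, while a 100-byte request in the same situation simply fails. In real kernel code, memory obtained from kvmalloc() is released with kvfree(), which works whichever allocator ended up backing the request.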