R: A note on memory management

Published: 2020-12-14 04:34:37 | Category: Big Data | Source: compiled from the web

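Suppose I have a matrix. I need to take a random subset of its rows and feed that subset to a machine learning algorithm such as svm. The random subset is known only at run time. In addition, there are other parameters that are chosen from a grid.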

So my code looks like this:

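# svm is assumed to come from the e1071 package
foo <- function(bigm, inTrain, moreParamsList) {
  # bigm[inTrain, ] is evaluated here and its result is stored in the argument list
  parsList <- c(list(data = bigm[inTrain, ]), moreParamsList)
  do.call(svm, parsList)
}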

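What I want to know is whether R allocates new memory to hold the bigm[inTrain, ] object inside parsList. (My guess is that it does.) Which commands can I use to test such hypotheses? Also, is there a way to work with a sub-matrix in R without using new memory?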
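One way the guess could be checked in base R is sketched below. The matrix dimensions and the names sub and bigm2 are made up for illustration. The sketch relies on object.size() to compare sizes and on tracemem(), which reports when a marked object is duplicated and only works in R builds with memory profiling enabled. As far as I know, base R has no matrix "views": the [ operator returns a new object, whereas plain assignment does not copy until one of the names is modified.

```r
# Small stand-in for the real matrix; the sizes are arbitrary.
set.seed(1)
bigm <- matrix(rnorm(1000 * 50), nrow = 1000, ncol = 50)
inTrain <- sample(nrow(bigm), 200)

# Subsetting with [ builds a new matrix that holds the selected rows.
sub <- bigm[inTrain, ]
size_full <- as.numeric(object.size(bigm))  # bytes used by the full matrix
size_sub  <- as.numeric(object.size(sub))   # bytes used by the subset

# Plain assignment only adds a second name for the same object.
bigm2 <- bigm

# tracemem() prints a message whenever the marked object is duplicated.
if (capabilities("profmem")) {
  tracemem(bigm)
  bigm2[1, 1] <- 0   # first write through bigm2 forces the actual copy
  untracemem(bigm)
}
```

Here size_sub should come out at roughly one fifth of size_full, which is consistent with the subset occupying its own memory rather than pointing into bigm.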

Edit:

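Also, suppose I call foo through mclapply (on Linux), with bigm residing in the parent process. Does that mean I am making mc.cores copies of bigm, or do all cores simply use the object from the parent?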

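Are there any functions or heuristics for tracking memory locations and the memory consumed by objects created on the different cores?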

Thanks.

Solution

I will put here what I found while researching this topic:

I don't think using mclapply makes mc.cores copies of bigm, judging by the manual for multicore:

In a nutshell fork spawns a copy (child) of the current process, that can work in parallel
to the master (parent) process. At the point of forking both processes share exactly the
same state including the workspace, global options, loaded packages etc. Forking is
relatively cheap in modern operating systems and no real copy of the used memory is
created, instead both processes share the same memory and only modified parts are copied.
This makes fork an ideal tool for parallel processing since there is no need to setup the
parallel working environment, data and code is shared automatically from the start.
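The behaviour described in the quote can be observed with a small experiment. This is only a sketch, assuming a Unix-like system where mclapply forks (it will not run with mc.cores = 2 on Windows). The worker functions col_mean and overwrite are hypothetical names introduced for illustration. The experiment shows that children can read bigm without it being passed as an argument, and that a write made in a child stays in that child. It does not measure physical memory use.

```r
library(parallel)

set.seed(1)
bigm <- matrix(rnorm(1000 * 50), nrow = 1000)
first_before <- bigm[1, 1]

# Illustrative worker: reads bigm from its enclosing environment in the
# forked child instead of receiving the matrix as an argument.
col_mean <- function(j) mean(bigm[, j])
res <- mclapply(1:4, col_mean, mc.cores = 2)

# Illustrative worker: overwrites one cell of bigm inside the child.
# Under fork, the change lands in the child's private copy of the
# affected memory only.
overwrite <- function(i) {
  bigm[1, 1] <<- -999
  bigm[1, 1]
}
child_vals <- mclapply(1:2, overwrite, mc.cores = 2)

# Back in the parent, bigm is untouched.
unchanged <- identical(bigm[1, 1], first_before)
```

Each child reports -999 for the cell it modified, while unchanged is TRUE in the parent, which matches the "only modified parts are copied" description. If the workers only read bigm, nothing forces the operating system to duplicate the matrix.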

(Editor: 李大同)
